Recombinant Production of Steviol Glycosides

ABSTRACT

Recombinant microorganisms, plants, and plant cells are disclosed that have been engineered to express recombinant genes encoding UDP-glycosyltransferases (UGTs). Such microorganisms, plants, or plant cells can produce steviol glycosides, e.g., Rebaudioside A and/or Rebaudioside D, which can be used as natural sweeteners in food products and dietary supplements.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Application Ser. No. 61/521,084, filed Aug. 8, 2011; U.S. Application Ser. No. 61/521,203, filed Aug. 8, 2011; U.S. Application Ser. No. 61/521,051, filed Aug. 8, 2011; U.S. Application Ser. No. 61/523,487, filed Aug. 15, 2011; U.S. Application Ser. No. 61/567,929, filed Dec. 7, 2011; and U.S. Application Ser. No. 61/603,639, filed Feb. 27, 2012, all of which are incorporated by reference herein in their entirety.

TECHNICAL FIELD

This disclosure relates to the recombinant production of steviol glycosides. In particular, this disclosure relates to the production of steviol glycosides such as rebaudioside D by recombinant hosts such as recombinant microorganisms, plants, or plant cells. This disclosure also provides compositions containing steviol glycosides. The disclosure also relates to tools and methods for producing terpenoids by modulating the biosynthesis of terpenoid precursors of the squalene pathway.

BACKGROUND

Sweeteners are well known as ingredients used most commonly in the food, beverage, or confectionery industries. A sweetener can either be incorporated into a final food product during production or used on its own, when appropriately diluted, as a tabletop sweetener or an at-home replacement for sugars in baking. Sweeteners include natural sweeteners such as sucrose, high fructose corn syrup, molasses, maple syrup, and honey, and artificial sweeteners such as aspartame, saccharin, and sucralose. Stevia extract is a natural sweetener that can be isolated and extracted from a perennial shrub, Stevia rebaudiana. Stevia is commonly grown in South America and Asia for commercial production of stevia extract. Stevia extract, purified to various degrees, is used commercially as a high intensity sweetener in foods and in blends or alone as a tabletop sweetener.

Extracts of the Stevia plant contain rebaudiosides and other steviol glycosides that contribute to the sweet flavor, although the amount of each glycoside often varies among different production batches. Existing commercial products are predominantly rebaudioside A with lesser amounts of other glycosides such as rebaudioside C, D, and F. Stevia extracts may also contain contaminants such as plant-derived compounds that contribute to off-flavors. These off-flavors can be more or less problematic depending on the food system or application of choice. Potential contaminants include pigments, lipids, proteins, phenolics, saccharides, spathulenol and other sesquiterpenes, labdane diterpenes, monoterpenes, decanoic acid, 8,11,14-eicosatrienoic acid, 2-methyloctadecane, pentacosane, octacosane, tetracosane, octadecanol, stigmasterol, β-sitosterol, α- and β-amyrin, lupeol, β-amyrin acetate, pentacyclic triterpenes, centaureidin, quercetin, epi-alpha-cadinol, caryophyllenes and derivatives, beta-pinene, beta-sitosterol, and gibberellin.

SUMMARY

Provided herein is a recombinant host, such as a microorganism, plant, or plant cell, comprising one or more biosynthesis genes whose expression results in production of steviol glycosides such as rebaudioside A, rebaudioside C, rebaudioside D, rebaudioside E, rebaudioside F, or dulcoside A. In particular, EUGT11, a uridine 5′-diphospho (UDP) glycosyl transferase described herein, can be used alone or in combination with one or more other UDP glycosyl transferases such as UGT74G1, UGT76G1, UGT85C2, and UGT91D2e, to allow the production and accumulation of rebaudioside D in recombinant hosts or using in vitro systems. As described herein, EUGT11 has a strong 1,2-19-O-glucose glycosylation activity, which is an important step for rebaudioside D production.

Typically, stevioside and rebaudioside A are the primary compounds in commercially produced stevia extracts. Stevioside is reported to have a more bitter and less sweet taste than rebaudioside A. The composition of stevia extract can vary from lot to lot depending on the soil and climate in which the plants are grown. Depending upon the source plant, the climate conditions, and the extraction process, the amount of rebaudioside A in commercial preparations is reported to vary from 20 to 97% of the total steviol glycoside content. Other steviol glycosides are present in varying amounts in stevia extracts. For example, Rebaudioside B is typically present at less than 1-2%, whereas Rebaudioside C can be present at levels as high as 7-15%. Rebaudioside D is typically present in levels of 2% or less, and Rebaudioside F is typically present in compositions at 3.5% or less of the total steviol glycosides. The amount of the minor steviol glycosides affects the flavor profile of a Stevia extract. In addition, Rebaudioside D and other higher glycosylated steviol glycosides are thought to be higher quality sweeteners than Rebaudioside A. As such, the recombinant hosts and methods described herein are particularly useful for producing steviol glycoside compositions having an increased amount of Rebaudioside D for use, for example, as a non-caloric sweetener with functional and sensory properties superior to those of many high-potency sweeteners.

In one aspect, this document features a recombinant host that includes a recombinant gene encoding a polypeptide having at least 80% identity to the amino acid sequence set forth in SEQ ID NO: 152.

This document also features a recombinant host that includes a recombinant gene encoding a polypeptide having the ability to transfer a second sugar moiety to the C-2′ of a 19-O-glucose of rubusoside. This document also features a recombinant host that includes a recombinant gene encoding a polypeptide having the ability to transfer a second sugar moiety to the C-2′ of a 19-O-glucose of stevioside.

In another aspect, this document features a recombinant host that includes a recombinant gene encoding a polypeptide having the ability to transfer a second sugar moiety to the C-2′ of the 19-O-glucose of rubusoside and to the C-2′ of the 13-O-glucose of rubusoside.

This document also features a recombinant host that includes a recombinant gene encoding a polypeptide having the ability to transfer a second sugar moiety to the C-2′ of a 19-O-glucose of rebaudioside A to produce rebaudioside D, wherein the catalysis rate of the polypeptide is at least 20 times faster (e.g., 25 or 30 times faster) than that of a UGT91D2e polypeptide having the amino acid sequence set forth in SEQ ID NO: 5 when the reactions are performed under corresponding conditions.

In any of the recombinant hosts described herein, the polypeptide can have at least 85% sequence identity (e.g., 90%, 95%, 98%, or 99% sequence identity) to the amino acid sequence set forth in SEQ ID NO: 152. The polypeptide can have the amino acid sequence set forth in SEQ ID NO: 152.
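As an illustration of how an identity threshold such as the 85% figure above might be checked, the following is a minimal sketch and not the method used in this document. It assumes the two sequences have already been aligned (gaps written as `-`) and, as a further assumption, uses the ungapped length of the reference as the denominator; other conventions exist. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 152.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity of a query to a reference, given a pairwise alignment.

    Assumptions: both strings come from the same alignment (equal length,
    '-' marks a gap), and the denominator is the ungapped reference length.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_length = sum(1 for r in aligned_ref if r != "-")
    if ref_length == 0:
        raise ValueError("reference contains no residues")
    # Count alignment columns where a reference residue is matched exactly.
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != "-" and r == q
    )
    return 100.0 * matches / ref_length


def meets_identity(aligned_ref: str, aligned_query: str, threshold: float) -> bool:
    """True when the query reaches the given percent-identity threshold."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Toy 10-residue example (not a real UGT sequence) with a single mismatch.
toy_ref = "MDSGYSSSYA"
toy_query = "MDSGYASSYA"
print(percent_identity(toy_ref, toy_query))
print(meets_identity(toy_ref, toy_query, 85.0))
```

In practice the alignment itself would come from a dedicated alignment program; only the counting step is sketched here.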

Any of the hosts described herein further can include a recombinant gene encoding a UGT85C polypeptide having at least 90% identity to the amino acid sequence set forth in SEQ ID NO:3. The UGT85C polypeptide can include one or more amino acid substitutions at residues 9, 10, 13, 15, 21, 27, 60, 65, 71, 87, 91, 220, 243, 270, 289, 298, 334, 336, 350, 368, 389, 394, 397, 418, 420, 440, 441, 444, and 471 of SEQ ID NO:3.

Any of the hosts described herein further can include a recombinant gene encoding a UGT76G polypeptide having at least 90% identity to the amino acid sequence set forth in SEQ ID NO:7. The UGT76G polypeptide can have one or more amino acid substitutions at residues 29, 74, 87, 91, 116, 123, 125, 126, 130, 145, 192, 193, 194, 196, 198, 199, 200, 203, 204, 205, 206, 207, 208, 266, 273, 274, 284, 285, 291, 330, 331, and 346 of SEQ ID NO:7.

Any of the hosts described herein further can include a gene (e.g., a recombinant gene) encoding a UGT74G1 polypeptide.

Any of the hosts described herein further can include a gene (e.g., a recombinant gene) encoding a functional UGT91D2 polypeptide. The UGT91D2 polypeptide can have at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO:5. The UGT91D2 polypeptide can have a mutation at position 206, 207, or 343 of SEQ ID NO:5. The UGT91D2 polypeptide also can have a mutation at positions 211 and 286 of SEQ ID NO:5 (e.g., L211M and V286A, referred to as UGT91D2e-b). The UGT91D2 polypeptide can have the amino acid sequence set forth in SEQ ID NOs: 5, 10, 12, 76, 78, or 95.

Any of the hosts described herein further can include one or more of

(i) a gene encoding a geranylgeranyl diphosphate synthase;

(ii) a gene encoding a bifunctional copalyl diphosphate synthase and kaurene synthase, or a gene encoding a copalyl diphosphate synthase and a gene encoding a kaurene synthase;

(iii) a gene encoding a kaurene oxidase; and

(iv) a gene encoding a steviol synthetase.

Each of the genes of (i), (ii), (iii), and (iv) can be a recombinant gene.

Any of the hosts described herein further can include one or more of

(v) a gene encoding a truncated HMG-CoA;

(vi) a gene encoding a CPR;

(vii) a gene encoding a rhamnose synthetase;

(viii) a gene encoding a UDP-glucose dehydrogenase; and

(ix) a gene encoding a UDP-glucuronic acid decarboxylase.

At least one of the genes of (i), (ii), (iii), (iv), (v), (vi), (vii), (viii), or (ix) can be a recombinant gene.

The geranylgeranyl diphosphate synthase can have greater than 90% sequence identity to one of the amino acid sequences set forth in SEQ ID NOs: 121-128. The copalyl diphosphate synthase can have greater than 90% sequence identity to one of the amino acid sequences set forth in SEQ ID NOs: 129-131. The kaurene synthase can have greater than 90% sequence identity to one of the amino acid sequences set forth in SEQ ID NOs: 132-135. The kaurene oxidase can have greater than 90% sequence identity to one of the amino acid sequences set forth in SEQ ID NOs: 138-141. The steviol synthetase can have greater than 90% sequence identity to one of the amino acid sequences set forth in SEQ ID NOs: 142-146.

Any of the recombinant hosts can produce at least one steviol glycoside when cultured under conditions in which each of the genes is expressed. The steviol glycoside can be selected from the group consisting of rubusoside, rebaudioside A, rebaudioside B, rebaudioside C, rebaudioside D, rebaudioside E, rebaudioside F, dulcoside A, stevioside, steviol-19-O-glucoside, steviol-13-O-glucoside, steviol-1,2-bioside, steviol-1,3-bioside, 1,3-stevioside, as well as other rhamnosylated or xylosylated intermediates. The steviol glycoside (e.g., rebaudioside D) can accumulate to at least 1 mg/liter (e.g., at least 10 mg/liter, 20 mg/liter, 100 mg/liter, 200 mg/liter, 300 mg/liter, 400 mg/liter, 500 mg/liter, 600 mg/liter, or 700 mg/liter, or greater) of culture medium when cultured under said conditions.

This document also features a method of producing a steviol glycoside. The method includes growing any of the hosts described herein in a culture medium, under conditions in which the genes are expressed; and recovering the steviol glycoside produced by the host. The growing step can include inducing expression of one or more of the genes. The steviol glycoside can be a 13-O-1,2-diglycosylated and/or a 19-O-1,2-diglycosylated steviol glycoside (e.g., stevioside, steviol-1,2-bioside, rebaudioside D, or rebaudioside E). For example, the steviol glycoside can be rebaudioside D or rebaudioside E. Other examples of steviol glycosides can include rebaudioside A, rebaudioside B, rebaudioside C, rebaudioside F, and dulcoside A.

This document also features a recombinant host. The host includes (i) a gene encoding a UGT74G1; (ii) a gene encoding a UGT85C2; (iii) a gene encoding a UGT76G1; (iv) a gene encoding a glycosyltransferase having the ability to transfer a second sugar moiety to the C-2′ of a 19-O-glucose of rubusoside or stevioside; and (v) optionally a gene encoding a UGT91D2e, wherein at least one of the genes is a recombinant gene. In some embodiments, each of the genes is a recombinant gene. The host can produce at least one steviol glycoside (e.g., rebaudioside D) when cultured under conditions in which each of the genes (e.g., recombinant genes) is expressed. The host further can include (a) a gene encoding a bifunctional copalyl diphosphate synthase and kaurene synthase, or a gene encoding a copalyl diphosphate synthase and a gene encoding a kaurene synthase; (b) a gene encoding a kaurene oxidase; (c) a gene encoding a steviol synthetase; and (d) a gene encoding a geranylgeranyl diphosphate synthase.

This document also features a steviol glycoside composition produced by any of the hosts described herein. The composition has reduced levels of stevia plant-derived contaminants relative to a stevia extract.

In another aspect, this document features a steviol glycoside composition produced by any of the hosts described herein. The composition has a steviol glycoside composition enriched for rebaudioside D relative to the steviol glycoside composition of a wild-type Stevia plant.

In yet another aspect, this document features a method of producing a steviol glycoside composition. The method includes growing a host described herein in a culture medium, under conditions in which each of the genes is expressed; and recovering the steviol glycoside composition produced by the host (e.g., a microorganism). The composition is enriched for rebaudioside A, rebaudioside B, rebaudioside C, rebaudioside D, rebaudioside E, rebaudioside F, or dulcoside A relative to the steviol glycoside composition of a wild-type Stevia plant. The steviol glycoside composition produced by the host (e.g., microorganism) can have a reduced level of stevia plant-derived contaminants relative to a stevia extract.

This document also features a method for transferring a second sugar moiety to the C-2′ of a 19-O-glucose or the C-2′ of a 13-O-glucose in a steviol glycoside. The method includes contacting the steviol glycoside with a EUGT11 polypeptide described herein or a UGT91D2 polypeptide described herein (e.g., UGT91D2e-b) and a UDP-sugar under suitable reaction conditions for the transfer of the second sugar moiety to the steviol glycoside. The steviol glycoside can be rubusoside, wherein the second sugar moiety is glucose, and stevioside is produced upon transfer of the second glucose moiety. The steviol glycoside can be stevioside, wherein the second sugar moiety is glucose, and Rebaudioside E is produced upon transfer of the second glucose moiety. The steviol glycoside can be Rebaudioside A, and Rebaudioside D is produced upon transfer of the second glucose moiety.
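The three substrate-to-product conversions named in the preceding paragraph can be written down as a small lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the table records nothing beyond the pairs stated in the text (it does not model which enzyme or which glucose position is involved).

```python
from typing import List

# Substrate -> product after transfer of one glucose to a C-2' position,
# exactly as listed in the text; names are illustrative, not from the source.
C2_PRIME_GLUCOSYLATION = {
    "rubusoside": "stevioside",
    "stevioside": "rebaudioside E",
    "rebaudioside A": "rebaudioside D",
}


def successive_products(substrate: str) -> List[str]:
    """Follow the table from a starting substrate until no entry remains."""
    products = []
    current = substrate
    while current in C2_PRIME_GLUCOSYLATION:
        current = C2_PRIME_GLUCOSYLATION[current]
        products.append(current)
    return products


print(successive_products("rubusoside"))
print(successive_products("rebaudioside A"))
```

Starting from rubusoside, the table yields stevioside and then Rebaudioside E, which matches the two successive transfers described above.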

In another embodiment of an improved downstream steviol glycoside pathway as disclosed herein, materials and methods are provided for the recombinant production of sucrose synthase and for increasing production of UDP-glucose in a host, specifically for increasing the availability of UDP-glucose in vivo, with the purpose of promoting glycosylation reactions in the cells. Methods for reducing UDP concentrations in the cells are also provided.

The document also provides a recombinant host comprising one or more exogenous nucleic acids encoding a sucrose transporter and a sucrose synthase, wherein expression of the one or more exogenous nucleic acids with a glucosyltransferase results in increased levels of UDP-glucose in the host. Optionally, the one or more exogenous nucleic acids comprise a SUS1 sequence. Optionally, the SUS1 sequence is from Coffea arabica, or encodes a functional homolog of the sucrose synthase encoded by the Coffea arabica SUS1 sequence, but equally an Arabidopsis thaliana or Stevia rebaudiana SUS may be used as described herein. In the recombinant host of the invention, the one or more exogenous nucleic acids may comprise a sequence encoding a polypeptide having the sequence set forth in SEQ ID NO: 180, or an amino acid sequence at least 90 percent identical thereto, and optionally the one or more exogenous nucleic acids comprise a SUC1 sequence. In one embodiment, the SUC1 sequence is from Arabidopsis thaliana, or the SUC1 sequence encodes a functional homolog of the sucrose transporter encoded by the Arabidopsis thaliana SUC1 sequence. In the recombinant host, the one or more exogenous nucleic acids may comprise a sequence encoding a polypeptide having the sequence set forth in SEQ ID NO: 179, or an amino acid sequence at least 90 percent identical thereto. The recombinant host has reduced ability to degrade external sucrose, as compared to a corresponding host that lacks the one or more exogenous nucleic acids.

The recombinant host may be a microorganism, such as a Saccharomycete, for example Saccharomyces cerevisiae. Alternatively, the microorganism is Escherichia coli. In an alternative embodiment, the recombinant host is a plant or plant cell.

The invention also provides a method for increasing the level of UDP-glucose and reducing the level of UDP in a cell, the method comprising expressing in the cell a recombinant sucrose synthase sequence and a recombinant sucrose transporter sequence, in a medium comprising sucrose, wherein the cell is deficient in sucrose degradation.

The invention additionally provides a method for promoting a glycosylation reaction in a cell, comprising expressing in the cell a recombinant sucrose synthase sequence and a recombinant sucrose transporter sequence, in a medium comprising sucrose, wherein the expressing results in a decreased level of UDP in the cell and an increased level of UDP-glucose in the cell, such that glycosylation in the cell is increased.

In either method for increasing the level of UDP-glucose or promoting glycosylation, the cell may produce vanillin glucoside, resulting in increased production of vanillin glucoside by the cell, or may produce steviol glucoside, resulting in increased production of steviol glucoside by the cell. Optionally, the SUS1 sequence is an A. thaliana, S. rebaudiana, or Coffea arabica SUS1 sequence (see e.g., FIG. 17, SEQ ID NOs: 175-177), or is a sequence that encodes a functional homolog of the sucrose synthase encoded by the A. thaliana, S. rebaudiana, or Coffea arabica SUS1 sequence. The recombinant sucrose synthase sequence optionally comprises a nucleic acid encoding a polypeptide having the sequence set forth in SEQ ID NO: 180, or an amino acid sequence at least 90% identical thereto, wherein optionally the recombinant sucrose transporter sequence is a SUC1 sequence, or wherein optionally the SUC1 sequence is an Arabidopsis thaliana SUC1 sequence, or is a sequence that encodes a functional homolog of the sucrose transporter encoded by the Arabidopsis thaliana SUC1 sequence, or wherein optionally the recombinant sucrose transporter sequence comprises a nucleic acid encoding a polypeptide having the sequence set forth in SEQ ID NO: 179, or an amino acid sequence at least 90% identical thereto. In either method, the host is a microorganism, for example a Saccharomycete, optionally such as Saccharomyces cerevisiae. Alternatively, the host may be Escherichia coli or a plant cell.

Also provided herein is a recombinant host, such as a microorganism, comprising one or more biosynthesis genes whose expression results in production of diterpenoids. Such genes include a gene encoding an ent-copalyl diphosphate synthase (CDPS) (EC 5.5.1.13), a gene encoding an ent-kaurene synthase, a gene encoding an ent-kaurene oxidase; or a gene encoding a steviol synthetase. At least one of the genes is a recombinant gene. The host can also be a plant cell. Expression of these gene(s) in a Stevia plant can result in increased steviol glycoside levels in the plant. In some embodiments the recombinant host further comprises a plurality of copies of a recombinant gene encoding a CDPS polypeptide (EC 5.5.1.13) lacking a chloroplast transit peptide sequence. The CDPS polypeptide can have at least 90%, 95%, 99%, or 100% identity to the truncated CDPS amino acid sequence set forth in FIG. 14. The host can further comprise a plurality of copies of a recombinant gene encoding a KAH polypeptide, e.g., a KAH polypeptide that has at least 90%, 95%, 99%, or 100% identity to the KAH amino acid sequence set forth in FIG. 12. The host can further comprise one or more of: (i) a gene encoding a geranylgeranyl diphosphate synthase; (ii) a gene encoding an ent-kaurene oxidase; and (iii) a gene encoding an ent-kaurene synthase. The host can further comprise one or more of (iv) a gene encoding a truncated HMG-CoA; (v) a gene encoding a CPR; (vi) a gene encoding a rhamnose synthetase; (vii) a gene encoding a UDP-glucose dehydrogenase; and (viii) a gene encoding a UDP-glucuronic acid decarboxylase. Two or more exogenous CPRs can be present, for example. The expression of one or more of such genes can be inducible. At least one of genes (i), (ii), (iii), (iv), (v), (vi), (vii), or (viii) can be a recombinant gene, and in some cases each of the genes of (i), (ii), (iii), (iv), (v), (vi), (vii), and (viii) is a recombinant gene. The geranylgeranyl diphosphate synthase can have greater than 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:127; the kaurene oxidase can have greater than 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:138; a CPR can have greater than 90% sequence identity to the amino acid sequence set forth in SEQ ID NO: 168; a CPR can have greater than 90% sequence identity to the amino acid sequence set forth in SEQ ID NO: 170, and a kaurene synthase can have greater than 90% sequence identity to the amino acid sequence set forth in SEQ ID NO: 156.

In one aspect, this document features an isolated nucleic acid encoding a polypeptide having the amino acid sequence set forth in SEQ ID NO:5, wherein the polypeptide contains substitutions at positions 211 and 286 of SEQ ID NO:5. For example, the polypeptide can include a methionine at position 211 and an alanine at position 286.
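Substitutions of this kind are written elsewhere in this document in the form L211M and V286A (wild-type residue, 1-based position, replacement residue). As a hedged illustration of that notation only, the sketch below applies such labels to a sequence string and checks that the stated wild-type residue is actually present. The function name is hypothetical, and the five-residue toy sequence stands in for SEQ ID NO:5, which is not reproduced here.

```python
import re
from typing import Iterable

# Matches labels such as "L211M": wild-type residue, position, replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Return a copy of `sequence` with the given point substitutions applied.

    Positions are 1-based. A ValueError is raised when a label is malformed,
    out of range, or names a wild-type residue that is not at that position.
    """
    residues = list(sequence)
    for label in substitutions:
        match = _SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"unrecognized substitution: {label!r}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range in {label!r}")
        if residues[index] != wild_type:
            raise ValueError(
                f"{label!r} expects {wild_type} but found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Toy sequence (hypothetical): L at position 3 and V at position 4.
print(apply_substitutions("MALVK", ["L3M", "V4A"]))
```

With the real sequence, the same call would take `["L211M", "V286A"]`; the wild-type check guards against numbering that is off by one.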

In one aspect, this document features an isolated nucleic acid encoding a polypeptide having at least 80% identity (e.g., at least 85%, 90%, 95%, or 99% identity) to the amino acid sequence set forth in FIG. 12C (SEQ ID NO:164). The polypeptide can have the amino acid sequence set forth in FIG. 12C.

In another aspect, this document features a nucleic acid construct that includes a regulatory region operably linked to a nucleic acid encoding a polypeptide having at least 80% identity (e.g., at least 85%, 90%, 95%, or 99% identity) to the amino acid sequence set forth in FIG. 12C (SEQ ID NO:164). The polypeptide can have the amino acid sequence set forth in FIG. 12C.

This document also features a recombinant host that includes a recombinant gene (e.g., a plurality of copies of a recombinant gene) encoding a KAH polypeptide having at least 80% identity (e.g., at least 85%, 90%, 95%, or 99% identity) to the amino acid sequence set forth in FIG. 12C. The polypeptide can have the amino acid sequence set forth in FIG. 12C. The host can be a microorganism such as a saccharomycete (e.g., Saccharomyces cerevisiae) or Escherichia coli. The host can be a plant or plant cell (e.g., a Stevia, Physcomitrella, or tobacco plant or plant cell). The Stevia plant or plant cell can be a Stevia rebaudiana plant or plant cell. The recombinant host can produce steviol when cultured under conditions in which each of the genes is expressed. The recombinant host can further comprise a gene encoding a UGT74G1 polypeptide; a gene encoding a UGT85C2 polypeptide; a gene encoding a UGT76G1 polypeptide; a gene encoding a UGT91D2 polypeptide; and/or a gene encoding a EUGT11 polypeptide. Such a host can produce at least one steviol glycoside when cultured under conditions in which each of the genes is expressed. The steviol glycoside can be steviol-13-O-glucoside, steviol-19-O-glucoside, rubusoside, rebaudioside A, rebaudioside B, rebaudioside C, rebaudioside D, rebaudioside E, rebaudioside F, and/or dulcoside A. The recombinant host can further comprise one or more of: a gene encoding a deoxyxylulose 5-phosphate synthase (DXS); a gene encoding a D-1-deoxyxylulose 5-phosphate reductoisomerase (DXR); a gene encoding a 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (CMS); a gene encoding a 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (CMK); a gene encoding a 4-diphosphocytidyl-2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MCS); a gene encoding a 1-hydroxy-2-methyl-2(E)-butenyl 4-diphosphate synthase (HDS); and a gene encoding a 1-hydroxy-2-methyl-2(E)-butenyl 4-diphosphate reductase (HDR). The recombinant host can further comprise one or more of: a gene encoding an acetoacetyl-CoA thiolase; a gene encoding a truncated HMG-CoA reductase; a gene encoding a mevalonate kinase; a gene encoding a phosphomevalonate kinase; and a gene encoding a mevalonate pyrophosphate decarboxylase. In another aspect, this document features a recombinant host that further comprises a gene encoding an ent-kaurene synthase (EC 4.2.3.19) and/or a gene encoding a gibberellin 20-oxidase (EC 1.14.11.12). Such a host produces gibberellin GA3 when cultured under conditions in which each of the genes is expressed.

This document also features an isolated nucleic acid encoding a CPR polypeptide having at least 80% sequence identity (e.g., at least 85%, 90%, 95%, or 99% sequence identity) to the S. rebaudiana CPR amino acid sequence set forth in FIG. 13. In some embodiments, the polypeptide has the S. rebaudiana CPR amino acid sequence set forth in FIG. 13 (SEQ ID NOs: 169 and 170).

In any of the hosts described herein, expression of one or more of the genes can be inducible.

In any of the hosts described herein, one or more genes encoding endogenous phosphatases can be deleted or disrupted such that endogenous phosphatase activity is reduced. For example, the yeast gene DPP1 and/or LPP1 can be disrupted or deleted such that the degradation of farnesyl pyrophosphate (FPP) to farnesol is reduced and the degradation of geranylgeranyl pyrophosphate (GGPP) to geranylgeraniol (GGOH) is reduced.

In another aspect, as described herein, ERG9 can be modified as defined below, resulting in the decreased production of squalene synthase (SQS) and an accumulation of terpenoid precursors. The precursors may or may not be secreted into the culture medium and can in turn be used as substrates to enzymes capable of metabolizing the terpenoid precursors into desired terpenoids.

Thus, in a main aspect the present invention relates to a cell comprising a nucleic acid sequence, said nucleic acid comprising

i) a promoter sequence operably linked to

ii) a heterologous insert sequence operably linked to

iii) an open reading frame operably linked to

iv) a transcription termination signal,

wherein the heterologous insert sequence has the general formula (I):

-X₁-X₂-X₃-X₄-X₅-

wherein X₂ comprises at least 4 consecutive nucleotides being complementary to, and forming a hairpin secondary structure element with, at least 4 consecutive nucleotides of X₄, and

wherein X₃ is optional and if present comprises unpaired nucleotides involved in forming a hairpin loop between X₂ and X₄, and

wherein X₁ and X₅ individually and optionally comprise one or more nucleotides, and

wherein the open reading frame upon expression encodes a polypeptide sequence having at least 70% identity to a squalene synthase (EC 2.5.1.21) or a biologically active fragment thereof, said fragment having at least 70% sequence identity to said squalene synthase in a range of overlap of at least 100 amino acids.
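The complementarity requirement placed on X₂ and X₄ in formula (I) can be checked computationally. The following is a minimal sketch under stated assumptions: sequences are given as DNA strings over A, C, G, and T, only Watson-Crick pairs count, and pairing is antiparallel, so a stem corresponds to a substring shared between X₂ and the reverse complement of X₄. The function names and example sequences are hypothetical and are not taken from this document.

```python
# Watson-Crick complements for DNA (assumption: no ambiguity codes, no G-T wobble).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_stem(x2: str, x4: str) -> int:
    """Length of the longest run of consecutive X2 nucleotides that can pair,
    antiparallel, with consecutive X4 nucleotides.

    Implemented as the longest common substring of X2 and revcomp(X4).
    """
    a = x2.upper()
    b = reverse_complement(x4)
    best = 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the matching run that ended at the previous positions.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def satisfies_stem_requirement(x2: str, x4: str, min_stem: int = 4) -> bool:
    """True when X2 and X4 share a stem of at least `min_stem` nucleotides."""
    return longest_stem(x2, x4) >= min_stem


print(longest_stem("GCGCAT", "ATGCGC"))
print(satisfies_stem_requirement("AAAA", "CCCC"))
```

This checks sequence complementarity only; whether a hairpin actually forms in the transcript also depends on loop length and folding energetics, which the sketch does not model.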

The cell of the present invention is useful in enhancing yield of industrially interesting terpenoids. Accordingly, in another aspect the present invention relates to a method for producing a terpenoid compound synthesized through the squalene pathway, in a cell culture, said method comprising the steps of

(a) providing the cell as defined herein above,

(b) culturing the cell of (a), and

(c) recovering the terpenoid product compound.

By providing the cell comprising the genetically modified construct defined herein above, the accumulation of terpenoid precursors is enhanced (see e.g., FIG. 20).

Thus, in another aspect, the invention relates to a method for producing a terpenoid derived from a terpenoid precursor selected from the group consisting of Farnesyl-pyrophosphate (FPP), Isopentenyl-pyrophosphate (IPP), Dimethylallyl-pyrophosphate (DMAPP), Geranyl-pyrophosphate (GPP) and/or Geranylgeranyl-pyrophosphate (GGPP), said method comprising:

(a) contacting said precursor with an enzyme of the squalene synthase pathway, and

(b) recovering the terpenoid product.

The present invention may operate by at least partly sterically hindering binding of the ribosome to the RNA, thus reducing the translation of squalene synthase.

Accordingly, in one aspect the present invention relates to a method for reducing the translation rate of a functional squalene synthase (EC 2.5.1.21), said method comprising:

(a) providing the cell defined herein above,

(b) culturing the cell of (a).

Similarly, the invention in another aspect relates to a method for decreasing turnover of farnesyl-pp to squalene, said method comprising:

(a) providing the cell defined herein above,

(b) culturing the cell of (a).

As depicted in FIG. 20, knocking down ERG9 results in a build-up of precursors to squalene synthase. Thus in one aspect, the present invention relates to a method for enhancing accumulation of a compound selected from the group consisting of Farnesyl-pyrophosphate, Isopentenyl-pyrophosphate, Dimethylallyl-pyrophosphate, Geranyl-pyrophosphate and Geranylgeranyl-pyrophosphate, said method comprising the steps of:

(a) providing the cell defined herein above, and

(b) culturing the cell of (a).

In one embodiment the invention relates to the production of Geranylgeranyl Pyrophosphate (GGPP) as well as other terpenoids, which can be prepared from Geranylgeranyl Pyrophosphate (GGPP).

In this embodiment of the invention the above described decrease of production of squalene synthase (SQS) may be combined with an increase in activity of Geranylgeranyl Pyrophosphate Synthase (GGPPS), which converts FPP to Geranylgeranyl Pyrophosphate (GGPP), leading to increased production of GGPP.

Thus, in one embodiment the invention relates to a microbial cell comprising a nucleic acid sequence, said nucleic acid comprising

i) a promoter sequence operably linked to

ii) a heterologous insert sequence operably linked to

iii) an open reading frame operably linked to

iv) a transcription termination signal,

wherein the heterologous insert sequence and the open reading frame are as defined herein above,

wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell.

In addition, the document relates to a method for producing steviol or a steviol glycoside, wherein the method comprises use of any one of the above-mentioned microbial cells.

Any of the hosts described herein can be a microorganism (e.g., a Saccharomycete such as Saccharomyces cerevisiae, or Escherichia coli), or a plant or plant cell (e.g., a Stevia such as a Stevia rebaudiana, Physcomitrella, or tobacco plant or plant cell).

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention belongs. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be limiting. Other features and advantages of the invention will be apparent from the following detailed description. Applicants reserve the right to alternatively claim any disclosed invention using the transitional phrase “comprising,” “consisting essentially of,” or “consisting of,” according to standard practice in patent law.

DESCRIPTION OF DRAWINGS

FIG. 1 is the chemical structure of various steviol glycosides.

FIGS. 2A-D show representative pathways for the biosynthesis of steviol glycosides from steviol.

FIG. 3 is a schematic representation of 19-O-1,2-diglycosylation reactions by EUGT11 and UGT91D2e. The numbers are the average signal intensity for the substrates or the products of the reaction, from liquid chromatography-mass spectrometry (LC-MS) chromatograms.

FIG. 4 contains LC-MS chromatograms showing the production of rebaudioside D (RebD) from Rebaudioside A (RebA) using in vitro transcribed and translated UGT91D2e (SEQ ID NO:5) (left panel) or EUGT11 (SEQ ID NO:152) (right panel). The LC-MS was set to detect certain masses corresponding to steviol+5 glucoses (such as RebD), steviol+4 glucoses (such as RebA) etc. Each ‘lane’ is scaled according to the highest peak.

FIG. 5 contains LC-MS chromatograms showing the conversion of rubusoside to stevioside and compounds ‘2’ and ‘3’ (RebE) by UGT91D2e (left panel) and EUGT11 (right panel).

FIG. 6 is an alignment of the amino acid sequence of EUGT11 (SEQ ID NO:152, top line) with the amino acid sequence of UGT91D2e (SEQ ID NO:5, bottom line).

FIG. 7 contains the amino acid sequence of EUGT11 (SEQ ID NO:152), the nucleotide sequence (SEQ ID NO:153) encoding EUGT11, and the nucleotide sequence encoding EUGT11 that has been codon optimized for expression in yeast (SEQ ID NO: 154).

FIG. 8 is an alignment of the secondary structure predictions of UGT91D2e with UGT85H2 and UGT71G1. Secondary structure predictions were made by subjecting the amino acid sequences of the three UGTs to NetSurfP ver. 1.1 (Protein Surface Accessibility and Secondary Structure Predictions), at the world wide web at cbs.dtu.dk/services/NetSurfP/. This predicted the presence and location of alpha helices, beta sheets and coils in the proteins. These were subsequently labeled as shown for UGT91D2e. For example, the first N-terminal beta-sheet was labeled Nβ1. The y-axis represents the certainty of the prediction (the higher the value, the more confident the prediction), and the x-axis represents amino acid position. Although the primary sequence identity between these UGTs is very low, the secondary structures show a very high degree of conservation.

FIG. 9 is an alignment of the amino acid sequences of UGT91D1 and UGT91D2e (SEQ ID NO: 5).

FIG. 10 is a bar graph of the activity of double amino acid substitution mutants of UGT91D2e. The filled bars represent stevioside production and the open bars represent 1,2-bioside production.

FIG. 11 is a schematic representation of UDP-glucose regeneration for the biosynthesis of steviol glycosides. SUS=sucrose synthase; Steviol=steviol or steviol glycoside substrate; UGT=UDP glycosyltransferase.

FIG. 12A is the nucleotide sequence encoding the Stevia rebaudiana KAH (SEQ ID NO: 163), designated SrKAHe1 herein.

FIG. 12B is the nucleotide sequence encoding the Stevia rebaudiana KAHe1 that has been codon-optimized for expression in yeast (SEQ ID NO: 165).

FIG. 12C is the amino acid sequence of the Stevia rebaudiana KAHe1 (SEQ ID NO:164).

FIG. 13A contains the amino acid sequences of CPR polypeptides from S. cerevisiae (encoded by NCP1 gene) (SEQ ID NO: 166), A. thaliana (encoded by ATR1 and encoded by ATR2) (SEQ ID NOs: 148 and 168), and S. rebaudiana (encoded by CPR7 and encoded by CPR8) (SEQ ID NOs: 169 and 170).

FIG. 13B contains ATR1 nucleotide sequence (Accession No. CAA23011) that has been codon optimized for expression in yeast (SEQ ID NO:171); ATR2 nucleotide sequence that has been codon optimized for expression in yeast (SEQ ID NO:172); the Stevia rebaudiana CPR7 nucleotide sequence (SEQ ID NO:173); and the Stevia rebaudiana CPR8 nucleotide sequence (SEQ ID NO:174).

FIG. 14A contains the nucleotide sequence (SEQ ID NO: 157) encoding a CDPS polypeptide (SEQ ID NO:158) from Zea mays. The sequence that is in bold and underlined can be deleted to remove the sequence encoding the chloroplast transit sequence.

FIG. 14B contains the amino acid sequence of the CDPS polypeptide (SEQ ID NO:158) from Zea mays. The sequence that is in bold and underlined can be deleted to remove the chloroplast transit sequence.

FIG. 15A contains a codon-optimized nucleotide sequence (SEQ ID NO:161) encoding a bifunctional CDPS-KS polypeptide (SEQ ID NO:162) from Gibberella fujikuroi. FIG. 15B contains the amino acid sequence of the bifunctional CDPS-KS polypeptide (SEQ ID NO:162) from Gibberella fujikuroi.

FIG. 16 is a graph of the growth of two strains of S. cerevisiae, enhanced EFSC1972 (designated T2) and enhanced EFSC1972 with further overexpression of the Arabidopsis thaliana kaurene synthase (KS-5) (designated T7, squares). Numbers on the y-axis are OD600 values of the cell culture, while numbers on the x-axis represent hours of growth in synthetic complete based medium at 30° C.

FIG. 17 contains the nucleic acid sequences encoding the A. thaliana, S. rebaudiana (from contig10573 selection_ORF S11E, with the mutation that changes S11 to glutamate (E) in bold, lowercase letters), and coffee (Coffea arabica) sucrose synthases, SEQ ID NOs:175, 176, and 177, respectively.

FIG. 18 is a bar graph of RebD production in permeabilized S. cerevisiae, which had been transformed with EUGT11 or an empty plasmid (“Empty”). Cells were grown to exponential growth phase, washed in PBS buffer and subsequently treated with Triton X-100 (0.3% or 0.5% in PBS) at 30° C. for 30 min. After permeabilization, cells were washed in PBS and resuspended in reaction mix containing 100 μM RebA and 300 μM UDP-glucose. Reactions proceeded for 20 h at 30° C.

FIG. 19A is the amino acid sequence of the A. thaliana UDP-glycosyltransferase UGT72E2 (SEQ ID NO:178).

FIG. 19B is the amino acid sequence of the sucrose transporter SUC1 from A. thaliana (SEQ ID NO:179).

FIG. 19C is the amino acid sequence of the sucrose synthase from coffee (SEQ ID NO:180).

FIG. 20 is a schematic of the isoprenoid pathway in yeast, showing the position of ERG9.

FIG. 21 contains the nucleotide sequence of the Saccharomyces cerevisiae Cyc1 promoter (SEQ ID NO: 185) and Saccharomyces cerevisiae Kex2 promoter (SEQ ID NO:186).

FIG. 22 is a schematic of the PCR product containing two regions, HR1 and HR2, which are homologous to parts of the genome sequence within the ERG9 promoter or 5′ end of the ERG9 open reading frame (ORF), respectively. Also on the PCR product is an antibiotic marker, NatR, which can be embedded between two Lox sites (L) for subsequent excision with Cre recombinase. The PCR product further can include a promoter, such as either the wild type ScKex2 or the wild type ScCyc1 promoter, and the promoter further can include a heterologous insert such as a hairpin (SEQ ID NOs: 181-184) at its 3′-end (See FIG. 23).

FIG. 23 is a schematic of promoter and ORF with a hairpin stemloop immediately upstream of the translation start site (arrow) and an alignment of a portion of the wild-type S. cerevisiae Cyc1 promoter sequence and initial ATG of the ERG9 ORF without a heterologous insert (SEQ ID NO:187) and with four different heterologous inserts (SEQ ID NOs. 188-191). 75% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 184 (SEQ ID NO:191); 50% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 183 (SEQ ID NO:190); 20% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 182 (SEQ ID NO:189); 5% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 181 (SEQ ID NO: 188).

FIG. 24 is a bar graph showing amorphadiene produced in yeast strains with different promoter constructs inserted in front of the ERG9 gene of the host genome. CTRL-ADS refers to control strain with no modification; ERG9-CYC1-100% refers to construct comprising the ScCyc1 promoter and no insert; ERG9-CYC1-50% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 183 (SEQ ID NO:190); ERG9-CYC1-20% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 182 (SEQ ID NO:189); ERG9-CYC1-5% refers to construct comprising the ScCyc1 promoter followed by SEQ ID NO: 181 (SEQ ID NO:188); ERG9-KEX2-100% refers to construct comprising the ScKex2 promoter.

FIG. 25 contains the amino acid sequence of squalene synthase polypeptides from Saccharomyces cerevisiae, Schizosaccharomyces pombe, Yarrowia lipolytica, Candida glabrata, Ashbya gossypii, Cyberlindnera jadinii, Candida albicans, Saccharomyces cerevisiae, Homo sapiens, Mus musculus, and Rattus norvegicus (SEQ ID NOs: 192-202), and the amino acid sequence of a geranylgeranyl diphosphate synthase (GGPPS) from Aspergillus nidulans and S. cerevisiae (SEQ ID NOs. 203 and 167).

FIG. 26 is a bar graph of geranylgeraniol (GGOH) accumulation in the ERG9-CYC1-5% strain and ERG9-KEX2 strain after 72 hours.

FIG. 27 is a representative chromatograph showing the conversion of rubusoside to xylosylated intermediates for RebF production by UGT91D2e and EUGT11.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

This document is based on the discovery that recombinant hosts such as plant cells, plants, or microorganisms can be developed that express polypeptides useful for the biosynthesis of steviol glycosides such as rebaudioside A, rebaudioside C, rebaudioside D, rebaudioside E, rebaudioside F, or dulcoside A. The recombinant hosts described herein are particularly useful for producing Rebaudioside D. Such hosts can express one or more Uridine 5′-diphospho (UDP) glycosyl transferases suitable for producing steviol glycosides. Expression of these biosynthetic polypeptides in various microbial chassis allows steviol glycosides to be produced in a consistent, reproducible manner from energy and carbon sources such as sugars, glycerol, CO₂, H₂, and sunlight. The proportion of each steviol glycoside produced by a recombinant host can be tailored by incorporating preselected biosynthetic enzymes into the hosts and expressing them at appropriate levels, to produce a sweetener composition with a consistent taste profile. Furthermore, the concentrations of steviol glycosides produced by recombinant hosts are expected to be higher than the levels of steviol glycosides produced in the Stevia plant, which improves the efficiency of the downstream purification. Such sweetener compositions contain little or no plant based contaminants, relative to the amount of contaminants present in Stevia extracts.

At least one of the genes is a recombinant gene, the particular recombinant gene(s) depending on the species or strain selected for use. Additional genes or biosynthetic modules can be included in order to increase steviol glycoside yield, improve efficiency with which energy and carbon sources are converted to steviol and its glycosides, and/or to enhance productivity from the cell culture or plant. Such additional biosynthetic modules include genes involved in the synthesis of the terpenoid precursors, isopentenyl diphosphate and dimethylallyl diphosphate. Additional biosynthetic modules include terpene synthase and terpene cyclase genes, such as genes encoding geranylgeranyl diphosphate synthase and copalyl diphosphate synthase; these genes may be endogenous genes or recombinant genes.

I. STEVIOL AND STEVIOL GLYCOSIDE BIOSYNTHESIS POLYPEPTIDES

A. Steviol Biosynthesis Polypeptides

Chemical structures for several of the compounds found in Stevia extracts are shown in FIG. 1, including the diterpene steviol and various steviol glycosides. CAS numbers are shown in Table A below. See also, Steviol Glycosides Chemical and Technical Assessment 69th JECFA, prepared by Harriet Wallin, Food Agric. Org. (2007).

TABLE A

COMPOUND          CAS #
Steviol           471-80-7
Rebaudioside A    58543-16-1
Steviolbioside    41093-60-1
Stevioside        57817-89-7
Rebaudioside B    58543-17-2
Rebaudioside C    63550-99-2
Rebaudioside D    63279-13-0
Rebaudioside E    63279-14-1
Rebaudioside F    438045-89-7
Rubusoside        64849-39-4
Dulcoside A       64432-06-0
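By way of illustration only, CAS numbers such as those in Table A can be screened for transcription errors using the check digit that forms the last field of every CAS Registry Number. The following is a minimal Python sketch. The function name cas_check_digit_ok and the sample dictionary are illustrative names introduced here and are not part of the original disclosure; only three rows of Table A are reproduced.

```python
def cas_check_digit_ok(cas: str) -> bool:
    """Return True if the last digit of a CAS Registry Number matches its checksum."""
    body, _, check = cas.rpartition("-")
    digits = body.replace("-", "")
    if not (digits.isdigit() and check.isdigit() and len(check) == 1):
        return False
    # Weight the digits 1, 2, 3, ... starting from the digit nearest the check digit.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(check)


# A few entries copied from Table A (illustrative subset only).
TABLE_A_SAMPLE = {
    "Steviol": "471-80-7",
    "Rebaudioside A": "58543-16-1",
    "Rebaudioside D": "63279-13-0",
}

for compound, cas in TABLE_A_SAMPLE.items():
    print(compound, cas, cas_check_digit_ok(cas))
```

A passing check digit does not prove that a number belongs to the named compound; it only shows that the digits are internally consistent.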

It has been discovered that expression of certain genes in a host such as a microorganism confers the ability to synthesize steviol glycosides upon that host. As discussed in more detail below, one or more of such genes may be present naturally in a host. Typically, however, one or more of such genes are recombinant genes that have been transformed into a host that does not naturally possess them.

The biochemical pathway to produce steviol involves formation of geranylgeranyl diphosphate, cyclization to (-) copalyl diphosphate, followed by oxidation and hydroxylation to form steviol. Thus, conversion of geranylgeranyl diphosphate to steviol in a recombinant microorganism involves the expression of a gene encoding a kaurene synthase (KS), a gene encoding a kaurene oxidase (KO), and a gene encoding a steviol synthetase (KAH). Steviol synthetase also is known as kaurenoic acid 13-hydroxylase.

Suitable KS polypeptides are known. For example, suitable KS enzymes include those made by Stevia rebaudiana, Zea mays, Populus trichocarpa, and Arabidopsis thaliana. See, Table 1 and SEQ ID NOs: 132-135 and 156. Nucleotide sequences encoding these polypeptides are set forth in SEQ ID NOs: 40-47 and 155. The nucleotide sequences set forth in SEQ ID NOs: 40-43 were modified for expression in yeast while the nucleotide sequences set forth in SEQ ID NOs: 44-47 are from the source organisms from which the KS polypeptides were identified.

TABLE 1 KS Clones

Enzyme Source Organism     gi Number  Accession Number  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Stevia rebaudiana          4959241    AAD34295          MM-12           2355          40            132
Stevia rebaudiana          4959239    AAD34294          MM-13           2355          41            133
Zea mays                   162458963  NP_001105097      MM-14           1773          42            134
Populus trichocarpa        224098838  XP_002311286      MM-15           2232          43            135
Arabidopsis thaliana       3056724    AF034774          EV-70           2358          155           156

Suitable KO polypeptides are known. For example, suitable KO enzymes include those made by Stevia rebaudiana, Arabidopsis thaliana, Gibberella fujikuroi and Trametes versicolor. See, Table 2 and SEQ ID NOs: 138-141. Nucleotide sequences encoding these polypeptides are set forth in SEQ ID NOs: 52-59. The nucleotide sequences set forth in SEQ ID NOs: 52-55 were modified for expression in yeast. The nucleotide sequences set forth in SEQ ID NOs: 56-59 are from the source organisms from which the KO polypeptides were identified.

TABLE 2 KO Clones

Enzyme Source Organism     gi Number  Accession Number  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Stevia rebaudiana          76446107   ABA42921          MM-18           1542          52            138
Arabidopsis thaliana       3342249    AAC39505          MM-19           1530          53            139
Gibberella fujikuroi       4127832    CAA76703          MM-20           1578          54            140
Trametes versicolor        14278967   BAB59027          MM-21           1500          55            141

Suitable KAH polypeptides are known. For example, suitable KAH enzymes include those made by Stevia rebaudiana, Arabidopsis thaliana, Vitis vinifera and Medicago truncatula. See, e.g., Table 3, SEQ ID NOs: 142-146; U.S. Patent Publication No. 2008-0271205; U.S. Patent Publication No. 2008-0064063 and Genbank Accession No. gi 189098312. The steviol synthetase from Arabidopsis thaliana is classified as a CYP714A2. Nucleotide sequences encoding these KAH enzymes are set forth in SEQ ID NOs: 60-69. The nucleotide sequences set forth in SEQ ID NOs: 60-64 were modified for expression in yeast while the nucleotide sequences from the source organisms from which the polypeptides were identified are set forth in SEQ ID NOs: 65-69.

TABLE 3 KAH Clones

Enzyme Source Organism     gi Number  Accession Number  Plasmid Name  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Stevia rebaudiana          -*                           pMUS35        MM-22           1578          60            142
Stevia rebaudiana          189418962  ACD93722          pMUS36        MM-23           1431          61            143
Arabidopsis thaliana       15238644   NP_197872         pMUS37        MM-24           1578          62            144
Vitis vinifera             225458454  XP_002282091      pMUS38        MM-25           1590          63            145
Medicago truncatula        84514135   ABC59076          pMUS39        MM-26           1440          64            146

* = Sequence is shown in U.S. Patent Publication No. 2008-0064063.

In addition, a KAH polypeptide from Stevia rebaudiana that was identified herein is particularly useful in a recombinant host. The nucleotide sequence (SEQ ID NO: 163) encoding the S. rebaudiana KAH (SrKAHe1) (SEQ ID NO:164) is set forth in FIG. 12A. A nucleotide sequence encoding the S. rebaudiana KAH that has been codon-optimized for expression in yeast (SEQ ID NO:165) is set forth in FIG. 12B. The amino acid sequence of the S. rebaudiana KAH is set forth in FIG. 12C. The S. rebaudiana KAH shows significantly higher steviol synthase activity as compared to the Arabidopsis thaliana ent-kaurenoic acid hydroxylase described by Yamaguchi et al. (U.S. Patent Publication No. 2008/0271205 A1) when expressed in S. cerevisiae. The S. rebaudiana KAH polypeptide set forth in FIG. 12C has less than 20% identity to the KAH from U.S. Patent Publication No. 2008/0271205, and less than 35% identity to the KAH from U.S. Patent Publication No. 2008/0064063.
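For illustration, a percent identity figure of the kind quoted above can be computed from a pairwise alignment roughly as follows. This is a sketch under stated assumptions: the text does not say which alignment program or which denominator was used, so the convention below (identical residues divided by alignment columns in which neither sequence has a gap) is only one of several in common use. The function name percent_identity and the two toy sequences are hypothetical and are not the KAH sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumed convention, see lead-in)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences; real inputs would come from an alignment tool.
toy_a = "MKT-LLVA"
toy_b = "MRTALL-A"
print(round(percent_identity(toy_a, toy_b), 1))
```

Because the choice of denominator changes the result, thresholds such as "less than 20% identity" are best compared using the same convention throughout.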

In some embodiments, a recombinant microorganism contains a recombinant gene encoding a KO and/or a KAH polypeptide. Such microorganisms also typically contain a recombinant gene encoding a cytochrome P450 reductase (CPR) polypeptide, since certain combinations of KO and/or KAH polypeptides require expression of an exogenous CPR polypeptide. In particular, the activity of a KO and/or a KAH polypeptide of plant origin can be significantly increased by the inclusion of a recombinant gene encoding an exogenous CPR polypeptide. Suitable CPR polypeptides are known. For example, suitable CPR enzymes include those made by Stevia rebaudiana and Arabidopsis thaliana. See, e.g., Table 4 and SEQ ID NOs: 147 and 148. Nucleotide sequences encoding these polypeptides are set forth in SEQ ID NOs: 70, 71, 73, and 74. The nucleotide sequences set forth in SEQ ID NOs: 70-72 were modified for expression in yeast. The nucleotide sequences from the source organisms from which the polypeptides were identified are set forth in SEQ ID NOs: 73-75.

TABLE 4 CPR Clones

Enzyme Source Organism     gi Number  Accession Number  Plasmid Name  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Stevia rebaudiana          93211213   ABB88839          pMUS40        MM-27           2133          70            147
Arabidopsis thaliana       15233853   NP_194183         pMUS41        MM-28           2079          71            148
Gibberella fujikuroi       32562989   CAE09055          pMUS42        MM-29           2142          72            149

For example, the steviol synthase encoded by SrKAHe1 is activated by the S. cerevisiae CPR encoded by gene NCP1 (YHR042W). Even better activation of the steviol synthase encoded by SrKAHe1 is observed when the Arabidopsis thaliana CPR encoded by the gene ATR2 or the S. rebaudiana CPR encoded by the gene CPR8 are co-expressed. FIG. 13A contains the amino acid sequence of the S. cerevisiae, A. thaliana (from ATR1 and ATR2 genes) and S. rebaudiana CPR polypeptides (from CPR7 and CPR8 genes) (SEQ ID NOs: 166-170). FIG. 13B contains the nucleotide sequence encoding the A. thaliana and S. rebaudiana CPR polypeptides (SEQ ID NOs: 171-174).

For example, the yeast gene DPP1 and/or the yeast gene LPP1 can be disrupted or deleted such that the degradation of farnesyl pyrophosphate (FPP) to farnesol is reduced and the degradation of geranylgeranylpyrophosphate (GGPP) to geranylgeraniol (GGOH) is reduced. Alternatively, the promoter or enhancer elements of an endogenous gene encoding a phosphatase can be altered such that the expression of their encoded proteins is altered. Homologous recombination can be used to disrupt an endogenous gene. For example, a “gene replacement” vector can be constructed in such a way to include a selectable marker gene. The selectable marker gene can be operably linked, at both 5′ and 3′ end, to portions of the gene of sufficient length to mediate homologous recombination. The selectable marker can be one of any number of genes that complement host cell auxotrophy, provide antibiotic resistance, or result in a color change. Linearized DNA fragments of the gene replacement vector then are introduced into the cells using methods well known in the art (see below). Integration of the linear fragments into the genome and the disruption of the gene can be determined based on the selection marker and can be verified by, for example, Southern blot analysis. Subsequent to its use in selection, a selectable marker can be removed from the genome of the host cell by, e.g., Cre-loxP systems (see, e.g., Gossen et al. (2002) Ann. Rev. Genetics 36:153-173 and U.S. Application Publication No. 20060014264). Alternatively, a gene replacement vector can be constructed in such a way as to include a portion of the gene to be disrupted, where the portion is devoid of any endogenous gene promoter sequence and encodes none, or an inactive fragment of, the coding sequence of the gene. An “inactive fragment” is a fragment of the gene that encodes a protein having, e.g., less than about 10% (e.g., less than about 9%, less than about 8%, less than about 7%, less than about 6%, less than about 5%, less than about 4%, less than about 3%, less than about 2%, less than about 1%, or 0%) of the activity of the protein produced from the full-length coding sequence of the gene. Such a portion of the gene is inserted in a vector in such a way that no known promoter sequence is operably linked to the gene sequence, but that a stop codon and a transcription termination sequence are operably linked to the portion of the gene sequence. This vector can be subsequently linearized in the portion of the gene sequence and transformed into a cell. By way of single homologous recombination, this linearized vector is then integrated in the endogenous counterpart of the gene.
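The layout of the linear gene replacement fragment described above (a selectable marker flanked at its 5′ and 3′ ends by portions of the target gene) can be sketched in Python as follows. This is an illustration only. The function name build_replacement_fragment, the arm length, and the placeholder sequences are assumptions introduced here; real homology arms and marker cassettes are far longer, and the text does not specify their lengths.

```python
def build_replacement_fragment(target_gene: str, marker: str, arm_length: int) -> str:
    """Flank a selectable marker with 5' and 3' portions of the target gene."""
    if arm_length <= 0 or 2 * arm_length > len(target_gene):
        raise ValueError("arm_length must be positive and at most half the gene length")
    five_prime_arm = target_gene[:arm_length]    # homology to the start of the gene
    three_prime_arm = target_gene[-arm_length:]  # homology to the end of the gene
    return five_prime_arm + marker + three_prime_arm


# Placeholder sequences (hypothetical; not DPP1, LPP1, or any real marker).
toy_gene = "ATGAAACCCGGGTTTAAACCCGGGTAA"
toy_marker = "nnnnMARKERnnnn"
fragment = build_replacement_fragment(toy_gene, toy_marker, arm_length=9)
print(fragment)
```

In this toy layout the middle of the target gene is absent from the fragment, so a double crossover at the two arms would replace that region with the marker.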

Expression in a recombinant microorganism of these genes results in the conversion of geranylgeranyl diphosphate to steviol.

B. Steviol Glycoside Biosynthesis Polypeptides

A recombinant host described herein can convert steviol to a steviol glycoside. Such a host (e.g., microorganism) contains genes encoding one or more UDP Glycosyl Transferases, also known as UGTs. UGTs transfer a monosaccharide unit from an activated nucleotide sugar to an acceptor moiety, in this case, an -OH or -COOH moiety on steviol or steviol derivative. UGTs have been classified into families and subfamilies based on sequence homology. See Li et al., J. Biol. Chem. 276:4338-4343 (2001).

B.1 Rubusoside Biosynthesis Polypeptides

The biosynthesis of rubusoside involves glycosylation of the 13-OH and the 19-COOH of steviol. See FIG. 2A. Conversion of steviol to rubusoside in a recombinant host such as a microorganism can be accomplished by the expression of gene(s) encoding UGTs 85C2 and 74G1, which transfer a glucose unit to the 13-OH or the 19-COOH, respectively, of steviol.

A suitable UGT85C2 functions as a uridine 5′-diphospho glucosyl:steviol 13-OH transferase, and a uridine 5′-diphospho glucosyl:steviol-19-O-glucoside 13-OH transferase. Functional UGT85C2 polypeptides also may catalyze glucosyl transferase reactions that utilize steviol glycoside substrates other than steviol and steviol-19-O-glucoside.

A suitable UGT74G1 polypeptide functions as a uridine 5′-diphospho glucosyl:steviol 19-COOH transferase and a uridine 5′-diphospho glucosyl:steviol-13-O-glucoside 19-COOH transferase. Functional UGT74G1 polypeptides also may catalyze glycosyl transferase reactions that utilize steviol glycoside substrates other than steviol and steviol-13-O-glucoside, or that transfer sugar moieties from donors other than uridine diphosphate glucose.

A recombinant microorganism expressing a functional UGT74G1 and a functional UGT85C2 can make rubusoside and both steviol monosides (i.e., steviol 13-O-monoglucoside and steviol 19-O-monoglucoside) when steviol is used as a feedstock in the medium. One or more of such genes may be present naturally in the host. Typically, however, such genes are recombinant genes that have been transformed into a host (e.g., microorganism) that does not naturally possess them.

As used herein, the term recombinant host is intended to refer to a host, the genome of which has been augmented by at least one incorporated DNA sequence. Such DNA sequences include but are not limited to genes that are not naturally present, DNA sequences that are not normally transcribed into RNA or translated into a protein (“expressed”), and other genes or DNA sequences which one desires to introduce into the non-recombinant host. It will be appreciated that typically the genome of a recombinant host described herein is augmented through the stable introduction of one or more recombinant genes. Generally, the introduced DNA is not originally resident in the host that is the recipient of the DNA, but it is within the scope of the invention to isolate a DNA segment from a given host, and to subsequently introduce one or more additional copies of that DNA into the same host, e.g., to enhance production of the product of a gene or alter the expression pattern of a gene. In some instances, the introduced DNA will modify or even replace an endogenous gene or DNA sequence by, e.g., homologous recombination or site-directed mutagenesis. Suitable recombinant hosts include microorganisms, plant cells, and plants.

The term “recombinant gene” refers to a gene or DNA sequence that is introduced into a recipient host, regardless of whether the same or a similar gene or DNA sequence may already be present in such a host. “Introduced,” or “augmented” in this context, is known in the art to mean introduced or augmented by the hand of man. Thus, a recombinant gene may be a DNA sequence from another species, or may be a DNA sequence that originated from or is present in the same species, but has been incorporated into a host by recombinant methods to form a recombinant host. It will be appreciated that a recombinant gene that is introduced into a host can be identical to a DNA sequence that is normally present in the host being transformed, and is introduced to provide one or more additional copies of the DNA to thereby permit overexpression or modified expression of the gene product of that DNA.

Suitable UGT74G1 and UGT85C2 polypeptides include those made by Stevia rebaudiana. Genes encoding functional UGT74G1 and UGT85C2 polypeptides from Stevia are reported in Richman, et al. Plant J. 41: 56-67 (2005). Amino acid sequences of S. rebaudiana UGT74G1 and UGT85C2 polypeptides are set forth in SEQ ID NOs: 1 and 3, respectively. Nucleotide sequences encoding UGT74G1 and UGT85C2 that have been optimized for expression in yeast are set forth in SEQ ID NOs: 2 and 4, respectively. DNA 2.0 codon-optimized sequences for UGTs 85C2, 91D2e, 74G1 and 76G1 are set forth in SEQ ID NOs: 82, 84, 83, and 85, respectively. See also the UGT85C2 and UGT74G1 variants described below in the “Functional Homolog” section. For example, a UGT85C2 polypeptide containing substitutions at positions 65, 71, 270, 289, and 389 can be used (e.g., A65S, E71Q, T270M, Q289H, and A389V).

In some embodiments, the recombinant host is a microorganism. The recombinant microorganism can be grown on media containing steviol in order to produce rubusoside. In other embodiments, however, the recombinant microorganism expresses one or more recombinant genes involved in steviol biosynthesis, e.g., a CDPS gene, a KS gene, a KO gene and/or a KAH gene. Suitable CDPS polypeptides are known. For example, suitable CDPS enzymes include those made by Stevia rebaudiana, Streptomyces clavuligerus, Bradyrhizobium japonicum, Zea mays, and Arabidopsis. See, e.g., Table 5 and SEQ ID NOs: 129-131, 158, and 160. Nucleotide sequences encoding these polypeptides are set forth in SEQ ID NOs: 34-39, 157, and 159. The nucleotide sequences set forth in SEQ ID NOs: 34-36 were modified for expression in yeast. The nucleotide sequences from the source organisms from which the polypeptides were identified are set forth in SEQ ID NOs: 37-39.

In some embodiments, CDPS polypeptides that lack a chloroplast transit peptide at the amino terminus of the unmodified polypeptide can be used. For example, the first 150 nucleotides from the 5′ end of the Zea mays CDPS coding sequence shown in FIG. 14A (SEQ ID NO:157) can be removed. Doing so removes the amino terminal 50 residues of the amino acid sequence shown in FIG. 14B (SEQ ID NO: 158), which constitute a chloroplast transit peptide. The truncated CDPS gene can be fitted with a new ATG translation start site and operably linked to a promoter, typically a constitutive or highly expressing promoter. When a plurality of copies of the truncated coding sequence are introduced into a microorganism, expression of the CDPS polypeptide from the promoter results in an increased carbon flux towards ent-kaurene biosynthesis.
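The arithmetic behind this truncation (150 nucleotides at three nucleotides per codon gives 50 residues) and the addition of a new start codon can be sketched in Python as follows. This is a minimal illustration, not the construction method of the disclosure: the function name truncate_transit_peptide is hypothetical, and the toy coding sequence stands in for SEQ ID NO:157, which is not reproduced here.

```python
def truncate_transit_peptide(cds: str, nucleotides_to_remove: int = 150) -> str:
    """Drop the 5' nucleotides encoding the transit peptide and add a new ATG start."""
    if nucleotides_to_remove % 3 != 0:
        raise ValueError("removal must be a whole number of codons to keep the reading frame")
    if nucleotides_to_remove >= len(cds):
        raise ValueError("cannot remove the entire coding sequence")
    return "ATG" + cds[nucleotides_to_remove:]


# Toy coding sequence (hypothetical): 50 codons of 'transit peptide', 3 codons of
# 'mature protein', then a stop codon.
toy_cds = "ATG" + "GCT" * 49 + "AAA" * 3 + "TAA"
truncated = truncate_transit_peptide(toy_cds)
residues_removed = 150 // 3  # three nucleotides per codon

print(truncated, residues_removed)
```

Removing a multiple of three nucleotides keeps the remaining codons in frame behind the new ATG, which is why the sketch rejects other lengths.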

TABLE 5 CDPS Clones

Enzyme Source Organism     gi Number  Accession Number  Plasmid Name  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Stevia rebaudiana          2642661    AAB87091          pMUS22        MM-9            2364          34            129
Streptomyces clavuligerus  197705855  EDY51667          pMUS23        MM-10           1584          35            130
Bradyrhizobium japonicum   529968     AAC28895.1        pMUS24        MM-11           1551          36            131
Zea mays                   50082774   AY562490                        EV65            2484          157           158
Arabidopsis thaliana       18412041   NM_116512                       EV64            2409          159           160

CDPS-KS bifunctional proteins (SEQ ID NOs: 136 and 137) also can be used. Nucleotide sequences encoding the CDPS-KS bifunctional enzymes shown in Table 6 were modified for expression in yeast (see SEQ ID NOs: 48 and 49). The nucleotide sequences from the source organisms from which the polypeptides were originally identified are set forth in SEQ ID NOs: 50 and 51. A bifunctional enzyme from Gibberella fujikuroi (SEQ ID NO:162) also can be used. A nucleotide sequence encoding the Gibberella fujikuroi bifunctional CDPS-KS enzyme was modified for expression in yeast (see FIG. 15A, SEQ ID NO:161).

TABLE 6 CDPS-KS Clones

Enzyme Source Organism     gi Number  Accession Number  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Phomopsis amygdali         186704306  BAG30962          MM-16           2952          48            136
Physcomitrella patens      146325986  BAF61135          MM-17           2646          49            137
Gibberella fujikuroi       62900107   Q9UVY5.1                          2859          161           162

Thus, a microorganism containing a CDPS gene, a KS gene, a KO gene and a KAH gene in addition to a UGT74G1 and a UGT85C2 gene is capable of producing both steviol monosides and rubusoside without the necessity for using steviol as a feedstock.
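As a rough illustration of why all four genes are needed, the steviol pathway can be treated as a linear chain in which each enzyme converts the product of the previous step. The Python sketch below walks that chain and reports how far a given gene set can proceed from geranylgeranyl diphosphate. It is a simplification under stated assumptions: the intermediate names ent-kaurene and kaurenoic acid are inferred from the enzyme names used in this document, and the names STEVIOL_PATHWAY and furthest_product are hypothetical.

```python
# Ordered (enzyme, product) steps; intermediate names are inferred, see lead-in.
STEVIOL_PATHWAY = [
    ("CDPS", "copalyl diphosphate"),
    ("KS", "ent-kaurene"),
    ("KO", "kaurenoic acid"),
    ("KAH", "steviol"),
]


def furthest_product(genes: set, start: str = "geranylgeranyl diphosphate") -> str:
    """Return the last compound reachable before a missing enzyme blocks the chain."""
    product = start
    for enzyme, next_product in STEVIOL_PATHWAY:
        if enzyme not in genes:
            break  # the chain stops at the first missing step
        product = next_product
    return product


print(furthest_product({"CDPS", "KS", "KO", "KAH"}))
print(furthest_product({"CDPS", "KS", "KAH"}))
```

The second call omits KO, so the walk stops before kaurenoic acid even though KAH is present; a later enzyme cannot compensate for an earlier gap in a linear pathway.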

In some embodiments, the recombinant microorganism further expresses a recombinant gene encoding a geranylgeranyl diphosphate synthase (GGPPS). Suitable GGPPS polypeptides are known. For example, suitable GGPPS enzymes include those made by Stevia rebaudiana, Gibberella fujikuroi, Mus musculus, Thalassiosira pseudonana, Streptomyces clavuligerus, Sulfolobus acidocaldarius, Synechococcus sp. and Arabidopsis thaliana. See, Table 7 and SEQ ID NOs: 121-128. Nucleotide sequences encoding these polypeptides are set forth in SEQ ID NOs: 18-33. The nucleotide sequences set forth in SEQ ID NOs: 18-25 were modified for expression in yeast while the nucleotide sequences from the source organisms from which the polypeptides were identified are set forth in SEQ ID NOs: 26-33.

TABLE 7 GGPPS Clones

Enzyme Source Organism     gi Number  Accession Number  Plasmid Name  Construct Name  Length (nts)  SEQ ID (DNA)  SEQ ID (protein)
Stevia rebaudiana          90289577   ABD92926          pMUS14        MM-1            1086          18            121
Gibberella fujikuroi       3549881    CAA75568          pMUS15        MM-2            1029          19            122
Mus musculus               47124116   AAH69913          pMUS16        MM-3            903           20            123
Thalassiosira pseudonana   223997332  XP_002288339      pMUS17        MM-4            1020          21            124
Streptomyces clavuligerus  254389342  ZP_05004570       pMUS18        MM-5            1068          22            125
Sulfolobus acidocaldarius  506371     BAA43200          pMUS19        MM-6            993           23            126
Synechococcus sp.          86553638   ABC98596          pMUS20        MM-7            894           24            127
Arabidopsis thaliana       15234534   NP_195399         pMUS21        MM-8            1113          25            128

In some embodiments, the recombinant microorganism further can expressrecombinant genes involved in diterpene biosynthesis or production ofterpenoid precursors, e.g., genes in the methylerythritol 4-phosphate(MEP) pathway or genes in the mevalonate (MEV) pathway discussed below,have reduced phosphatase activity, and/or express a sucrose synthase(SUS) as discussed herein.

B. 2 Rebaudioside a, Rebaudioside D, and Rebaudioside E BiosynthesisPolypeptides

The biosynthesis of rebaudioside A involves glucosylation of theaglycone steviol. Specifically, rebaudioside A can be formed byglucosylation of the 13-OH of steviol which forms the13-O-steviolmonoside, glucosylation of the C-2′ of the 13-O-glucose ofsteviolmonoside which forms steviol-1,2-bioside, glucosylation of theC-19 carboxyl of steviol-1,2-bioside which forms stevioside, andglucosylation of the C-3′ of the C-13-O-glucose of stevioside. The orderin which each glucosylation reaction occurs can vary. See FIG. 2A.

The biosynthesis of rebaudioside E and/or rebaudioside D involves glucosylation of the aglycone steviol. Specifically, rebaudioside E can be formed by glucosylation of the 13-OH of steviol which forms steviol-13-O-glucoside, glucosylation of the C-2′ of the 13-O-glucose of steviol-13-O-glucoside which forms the steviol-1,2-bioside, glucosylation of the C-19 carboxyl of the 1,2-bioside to form 1,2-stevioside, and glucosylation of the C-2′ of the 19-O-glucose of the 1,2-stevioside to form rebaudioside E. Rebaudioside D can be formed by glucosylation of the C-3′ of the C-13-O-glucose of rebaudioside E. The order in which each glycosylation reaction occurs can vary. For example, the glucosylation of the C-2′ of the 19-O-glucose may be the last step in the pathway, wherein Rebaudioside A is an intermediate in the pathway. See FIG. 2C.
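The glucosylation sequences described above can be pictured as a small directed graph. The Python sketch below encodes only the step orders spelled out in the text (the text notes that other orders are possible) and enumerates the routes from steviol to a chosen product. It treats "stevioside" and "1,2-stevioside" as one compound, which is an assumption of this sketch, and the names `GLUCOSYLATIONS` and `routes` are illustrative.

```python
# Substrate -> list of (product, position glucosylated), as described in the text.
GLUCOSYLATIONS = {
    "steviol": [("steviol-13-O-glucoside", "13-OH")],
    "steviol-13-O-glucoside": [("steviol-1,2-bioside", "C-2' of 13-O-glucose")],
    "steviol-1,2-bioside": [("stevioside", "C-19 carboxyl")],
    "stevioside": [
        ("rebaudioside A", "C-3' of 13-O-glucose"),
        ("rebaudioside E", "C-2' of 19-O-glucose"),
    ],
    "rebaudioside A": [("rebaudioside D", "C-2' of 19-O-glucose")],
    "rebaudioside E": [("rebaudioside D", "C-3' of 13-O-glucose")],
}


def routes(start, target):
    """Enumerate every route from start to target by depth-first search."""
    if start == target:
        return [[start]]
    found = []
    for product, _position in GLUCOSYLATIONS.get(start, []):
        for tail in routes(product, target):
            found.append([start] + tail)
    return found


if __name__ == "__main__":
    for route in routes("steviol", "rebaudioside D"):
        print(" -> ".join(route))
```

In this encoding both routes to rebaudioside D take five glucosylation steps, one passing through rebaudioside A and the other through rebaudioside E.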

It has been discovered that conversion of steviol to rebaudioside A, rebaudioside D, and/or rebaudioside E in a recombinant host can be accomplished by expressing the following functional UGTs: EUGT11, 74G1, 85C2, and 76G1, and optionally 91D2. Thus, a recombinant microorganism expressing combinations of these four or five UGTs can make rebaudioside A and rebaudioside D when steviol is used as a feedstock. Typically, one or more of these genes are recombinant genes that have been transformed into a microorganism that does not naturally possess them. It has also been discovered that UGTs designated herein as SM12UGT can be substituted for UGT91D2.

In some embodiments, fewer than five (e.g., one, two, three, or four) UGTs are expressed in a host. For example, a recombinant microorganism expressing a functional EUGT11 can make rebaudioside D when rebaudioside A is used as a feedstock. A recombinant microorganism expressing two functional UGTs, EUGT11 and 76G1, and optionally a functional 91D2, can make rebaudioside D when rubusoside or 1,2-stevioside is used as a feedstock. As another alternative, a recombinant microorganism expressing three functional UGTs, EUGT11, 74G1, and 76G1, and optionally 91D2, can make rebaudioside D when fed the monoside, steviol-13-O-glucoside, in the medium. Similarly, conversion of steviol-19-O-glucoside to rebaudioside D in a recombinant microorganism can be accomplished by the expression of genes encoding UGTs EUGT11, 85C2, 76G1, and optionally 91D2, when fed steviol-19-O-glucoside. Typically, one or more of these genes are recombinant genes that have been transformed into a host that does not naturally possess them.
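The feedstock-dependent UGT combinations described above lend themselves to a simple lookup. The Python sketch below records, for each feedstock named in the text, the UGTs stated as sufficient for rebaudioside D production, and reports which of them a given host lacks. The optional 91D2 is omitted, and the names `REQUIRED_UGTS_FOR_REBD` and `missing_ugts` are hypothetical, chosen only for illustration.

```python
# Feedstock -> UGTs that the text lists for making rebaudioside D from it.
# The optional 91D2 is left out because the text does not require it.
REQUIRED_UGTS_FOR_REBD = {
    "steviol": {"EUGT11", "74G1", "85C2", "76G1"},
    "rebaudioside A": {"EUGT11"},
    "rubusoside": {"EUGT11", "76G1"},
    "1,2-stevioside": {"EUGT11", "76G1"},
    "steviol-13-O-glucoside": {"EUGT11", "74G1", "76G1"},
    "steviol-19-O-glucoside": {"EUGT11", "85C2", "76G1"},
}


def missing_ugts(feedstock, expressed):
    """Return the listed UGTs that are absent from the set expressed by the host."""
    return REQUIRED_UGTS_FOR_REBD[feedstock] - set(expressed)


if __name__ == "__main__":
    host = {"EUGT11", "76G1"}
    for feedstock in REQUIRED_UGTS_FOR_REBD:
        gap = missing_ugts(feedstock, host)
        status = "sufficient" if not gap else "missing " + ", ".join(sorted(gap))
        print(f"{feedstock:<24} {status}")
```

For a host expressing only EUGT11 and 76G1, the lookup reports the set as sufficient for rebaudioside A, rubusoside, and 1,2-stevioside feedstocks, and flags the additional UGTs needed for steviol or either monoside.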

Suitable UGT74G1 and UGT85C2 polypeptides include those discussed above. A suitable UGT76G1 adds a glucose moiety to the C-3′ of the C-13-O-glucose of the acceptor molecule, a steviol 1,2 glycoside. Thus, UGT76G1 functions, for example, as a uridine 5′-diphospho glucosyl: steviol 13-O-1,2 glucoside C-3′ glucosyl transferase and a uridine 5′-diphospho glucosyl: steviol-19-O-glucose, 13-O-1,2 bioside C-3′ glucosyl transferase. Functional UGT76G1 polypeptides may also catalyze glucosyl transferase reactions that utilize steviol glycoside substrates that contain sugars other than glucose, e.g., steviol rhamnosides and steviol xylosides. See, FIGS. 2A, 2B, 2C and 2D. Suitable UGT76G1 polypeptides include those made by S. rebaudiana and reported in Richman, et al. Plant J. 41: 56-67 (2005). The amino acid sequence of a S. rebaudiana UGT76G1 polypeptide is set forth in SEQ ID NO:7. The nucleotide sequence encoding the UGT76G1 polypeptide of SEQ ID NO:7 has been optimized for expression in yeast and is set forth in SEQ ID NO:8. See also the UGT76G1 variants set forth in the “Functional Homolog” section.

A suitable EUGT11 or UGT91D2 polypeptide functions as a uridine 5′-diphospho glucosyl: steviol-13-O-glucoside transferase (also referred to as a steviol-13-monoglucoside 1,2-glucosylase), transferring a glucose moiety to the C-2′ of the 13-O-glucose of the acceptor molecule, steviol-13-O-glucoside.

A suitable EUGT11 or UGT91D2 polypeptide also functions as a uridine 5′-diphospho glucosyl: rubusoside transferase, transferring a glucose moiety to the C-2′ of the 13-O-glucose of the acceptor molecule, rubusoside, to produce stevioside. EUGT11 polypeptides also can transfer a glucose moiety to the C-2′ of the 19-O-glucose of the acceptor molecule, rubusoside, to produce a 19-O-1,2-diglycosylated rubusoside (compound 2 in FIG. 3).

Functional EUGT11 or UGT91D2 polypeptides also can catalyze reactions that utilize steviol glycoside substrates other than steviol-13-O-glucoside and rubusoside. For example, a functional EUGT11 polypeptide may utilize stevioside as a substrate, transferring a glucose moiety to the C-2′ of the 19-O-glucose residue to produce Rebaudioside E (see compound 3 in FIG. 3). Functional EUGT11 and UGT91D2 polypeptides may also utilize Rebaudioside A as a substrate, transferring a glucose moiety to the C-2′ of the 19-O-glucose residue of Rebaudioside A to produce Rebaudioside D. As set forth in the Examples, EUGT11 can convert Rebaudioside A to Rebaudioside D at a rate that is at least 20 times faster (e.g., at least 25 times or at least 30 times faster) than the corresponding rate of UGT91D2e (SEQ ID NO: 5) when the reactions are performed under similar conditions, i.e., similar time, temperature, purity, and substrate concentration. As such, EUGT11 produces greater amounts of RebD than UGT91D2e when incubated under similar conditions.

In addition, a functional EUGT11 exhibits significant C-2′ 19-O-diglycosylation activity with rubusoside or stevioside as substrates, whereas UGT91D2e has no detectable diglycosylation activity with these substrates. Thus, a functional EUGT11 can be distinguished from UGT91D2e by the differences in steviol glycoside substrate-specificity. FIG. 3 provides a schematic overview of the 19-O-1,2 diglycosylation reactions that are performed by EUGT11 and UGT91D2e.

A functional EUGT11 or UGT91D2 polypeptide typically does not transfer a glucose moiety to steviol compounds having a 1,3-bound glucose at the C-13 position, i.e., transfer of a glucose moiety to steviol 1,3-bioside and 1,3-stevioside does not occur.

Functional EUGT11 and UGT91D2 polypeptides can transfer sugar moieties from donors other than uridine diphosphate glucose. For example, a functional EUGT11 or UGT91D2 polypeptide can act as a uridine 5′-diphospho D-xylosyl: steviol-13-O-glucoside transferase, transferring a xylose moiety to the C-2′ of the 13-O-glucose of the acceptor molecule, steviol-13-O-glucoside. As another example, a functional EUGT11 or UGT91D2 polypeptide can act as a uridine 5′-diphospho L-rhamnosyl: steviol-13-O-glucoside transferase, transferring a rhamnose moiety to the C-2′ of the 13-O-glucose of the acceptor molecule, steviol-13-O-glucoside.

Suitable EUGT11 polypeptides are described herein and can include the EUGT11 polypeptide from Oryza sativa (GenBank Accession No. AC133334). For example, an EUGT11 polypeptide can have an amino acid sequence with at least 70% sequence identity (e.g., at least 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity) to the amino acid sequence set forth in SEQ ID NO: 152 (see FIG. 7). The nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 152 is set forth in SEQ ID NO: 153. SEQ ID NO: 154 is a nucleotide sequence encoding the polypeptide of SEQ ID NO: 152 that has been codon optimized for expression in yeast.
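To illustrate how a sequence identity threshold such as the 70% figure above might be applied, the following Python sketch computes percent identity over a pair of sequences that have already been aligned. It assumes one simple convention (identical residues divided by the number of alignment columns, with gaps written as "-"); this section does not define the calculation, and other conventions exist. The toy sequences and function names are hypothetical and are not taken from SEQ ID NO: 152.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned, equal in length, and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the pair reaches the chosen identity threshold (default 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACDEFGHIKL"   # toy reference sequence, hypothetical
    candidate = "ACDEFGHIKV"   # toy candidate with one substitution
    print(percent_identity(reference, candidate), meets_threshold(reference, candidate))
```

In practice the alignment itself would come from an established alignment tool; the sketch covers only the final counting step.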

Suitable functional UGT91D2 polypeptides include those disclosed herein, e.g., the polypeptides designated UGT91D2e and UGT91D2m. The amino acid sequence of an exemplary UGT91D2e polypeptide from Stevia rebaudiana is set forth in SEQ ID NO: 5. SEQ ID NO:6 is a nucleotide sequence encoding the polypeptide of SEQ ID NO:5 that has been codon optimized for expression in yeast. The S. rebaudiana nucleotide sequence encoding the polypeptide of SEQ ID NO:5 is set forth in SEQ ID NO:9. The amino acid sequences of exemplary UGT91D2m polypeptides from S. rebaudiana are set forth in SEQ ID NOs: 10 and 12, and are encoded by the nucleic acid sequences set forth in SEQ ID NOs: 11 and 13, respectively. In addition, UGT91D2 variants containing a substitution at amino acid residues 206, 207, and 343 of SEQ ID NO: 5 can be used. For example, the amino acid sequence set forth in SEQ ID NO:95, which has the mutations G206R, Y207C, and W343R with respect to wild-type UGT91D2e (SEQ ID NO:5), can be used. In addition, a UGT91D2 variant containing substitutions at amino acid residues 211 and 286 can be used. For example, a UGT91D2 variant can include a substitution of a methionine for leucine at position 211 and a substitution of an alanine for valine at position 286 of SEQ ID NO:5 (UGT91D2e-b).

As indicated above, UGTs designated herein as SM12UGT can be substituted for UGT91D2. Suitable functional SM12UGT polypeptides include those made by Ipomoea purpurea (Japanese morning glory) and described in Morita et al. Plant J. 42, 353-363 (2005). The amino acid sequence of the I. purpurea IP3GGT polypeptide is set forth in SEQ ID NO:76. SEQ ID NO:77 is a nucleotide sequence encoding the polypeptide of SEQ ID NO:76 that has been codon optimized for expression in yeast. Another suitable SM12UGT polypeptide is a Bp94B1 polypeptide having an R25S mutation. See Osmani et al. Plant Phys. 148: 1295-1308 (2008) and Sawada et al. J. Biol. Chem. 280:899-906 (2005). The amino acid sequence of the Bellis perennis (red daisy) UGT94B1 polypeptide is set forth in SEQ ID NO:78. SEQ ID NO:79 is the nucleotide sequence encoding the polypeptide of SEQ ID NO:78 that has been codon optimized for expression in yeast.

In some embodiments, the recombinant microorganism is grown on media containing steviol-13-O-glucoside or steviol-19-O-glucoside in order to produce rebaudioside A and/or rebaudioside D. In such embodiments, the microorganism contains and expresses genes encoding a functional EUGT11, a functional UGT74G1, a functional UGT85C2, a functional UGT76G1, and an optional functional UGT91D2, and is capable of accumulating rebaudioside A and rebaudioside D when steviol, one or both of the steviolmonosides, or rubusoside is used as feedstock.

In other embodiments, the recombinant microorganism is grown on media containing rubusoside in order to produce rebaudioside A and/or rebaudioside D. In such embodiments, the microorganism contains and expresses genes encoding a functional EUGT11, a functional UGT76G1, and an optional functional UGT91D2, and is capable of producing rebaudioside A and/or rebaudioside D when rubusoside is used as feedstock.

In other embodiments, the recombinant microorganism expresses one or more genes involved in steviol biosynthesis, e.g., a CDPS gene, a KS gene, a KO gene and/or a KAH gene. Thus, for example, a microorganism containing a CDPS gene, a KS gene, a KO gene and a KAH gene, in addition to a EUGT11, a UGT74G1, a UGT85C2, a UGT76G1, and optionally a functional UGT91D2 (e.g., UGT91D2e), is capable of producing rebaudioside A, rebaudioside D, and/or rebaudioside E without the necessity for including steviol in the culture media.

In some embodiments, the recombinant host further contains and expresses a recombinant GGPPS gene in order to provide increased levels of the diterpene precursor geranylgeranyl diphosphate, for increased flux through the steviol biosynthetic pathway. In some embodiments, the recombinant host further contains a construct to silence the expression of non-steviol pathways consuming geranylgeranyl diphosphate, ent-kaurenoic acid or farnesyl pyrophosphate, thereby providing increased flux through the steviol and steviol glycosides biosynthetic pathways. For example, flux to sterol production pathways such as ergosterol may be reduced by downregulation of the ERG9 gene. See, the ERG9 section below and Examples 24-25. In cells that produce gibberellins, gibberellin synthesis may be downregulated to increase flux of ent-kaurenoic acid to steviol. In carotenoid-producing organisms, flux to steviol may be increased by downregulation of one or more carotenoid biosynthetic genes. In some embodiments, the recombinant microorganism further can express recombinant genes involved in diterpene biosynthesis or production of terpenoid precursors, e.g., genes in the MEP or MEV pathways discussed below, have reduced phosphatase activity, and/or express a SUS as discussed herein.

One with skill in the art will recognize that by modulating relative expression levels of different UGT genes, a recombinant host can be tailored to specifically produce steviol glycoside products in a desired proportion. Transcriptional regulation of steviol biosynthesis genes and steviol glycoside biosynthesis genes can be achieved by a combination of transcriptional activation and repression using techniques known to those in the art. For in vitro reactions, one with skill in the art will recognize that addition of different levels of UGT enzymes in combination or under conditions which impact the relative activities of the different UGTs in combination will direct synthesis towards a desired proportion of each steviol glycoside. One with skill in the art will recognize that a higher proportion of rebaudioside D or E or more efficient conversion to rebaudioside D or E can be obtained with a diglycosylation enzyme that has a higher activity for the 19-O-glucoside reaction as compared to the 13-O-glucoside reaction (substrates rebaudioside A and stevioside).

In some embodiments, a recombinant host such as a microorganism produces rebaudioside D-enriched steviol glycoside compositions that have greater than at least 3% rebaudioside D by weight total steviol glycosides, e.g., at least 4% rebaudioside D, at least 5% rebaudioside D, 10-20% rebaudioside D, 20-30% rebaudioside D, 30-40% rebaudioside D, 40-50% rebaudioside D, 50-60% rebaudioside D, 60-70% rebaudioside D, or 70-80% rebaudioside D. In some embodiments, a recombinant host such as a microorganism produces steviol glycoside compositions that have at least 90% rebaudioside D, e.g., 90-99% rebaudioside D. Other steviol glycosides present may include those depicted in FIG. 2C such as steviol monosides, steviol glucobiosides, rebaudioside A, rebaudioside E, and stevioside. In some embodiments, the rebaudioside D-enriched composition produced by the host (e.g., microorganism) can be further purified and the rebaudioside D or rebaudioside E so purified can then be mixed with other steviol glycosides, flavors, or sweeteners to obtain a desired flavor system or sweetening composition. For instance, a rebaudioside D-enriched composition produced by a recombinant host can be combined with a rebaudioside A, C, or F-enriched composition produced by a different recombinant host, with rebaudioside A, F, or C purified from a Stevia extract, or with rebaudioside A, F, or C produced in vitro.

In some embodiments, rebaudioside A, rebaudioside D, rebaudioside B, steviol monoglucosides, steviol-1,2-bioside, rubusoside, stevioside, or rebaudioside E can be produced using in vitro methods while supplying the appropriate UDP-sugar and/or a cell-free system for regeneration of UDP-sugars. See, for example, Jewett M C, et al. Molecular Systems Biology, Vol. 4, article 220 (2008); Masada S et al. FEBS Letters, Vol. 581, 2562-2566 (2007). In some embodiments, sucrose and a sucrose synthase may be provided in the reaction vessel in order to regenerate UDP-glucose from the UDP generated during glycosylation reactions. See FIG. 11. The sucrose synthase can be from any suitable organism. For example, a sucrose synthase coding sequence from Arabidopsis thaliana, Stevia rebaudiana, or Coffea arabica can be cloned into an expression plasmid under control of a suitable promoter, and expressed in a host such as a microorganism or a plant.
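The UDP-glucose regeneration scheme can be summarized as a mass balance: each glucosylation releases one UDP, and sucrose synthase converts that UDP plus one sucrose back into UDP-glucose and fructose, so sucrose is consumed one-for-one with glucose units added while UDP is recycled. The Python sketch below works through that arithmetic under idealized assumptions (complete conversion, no side reactions, no loss of UDP); the dictionary and function names are illustrative, and the example quantities are hypothetical.

```python
# (substrate, product) -> number of glucose units added, per the pathway
# steps described in this section.
GLUCOSES_ADDED = {
    ("stevioside", "rebaudioside A"): 1,   # C-3' of the 13-O-glucose
    ("rebaudioside A", "rebaudioside D"): 1,   # C-2' of the 19-O-glucose
    ("stevioside", "rebaudioside D"): 2,   # both of the steps above
}


def sucrose_needed(substrate: str, product: str, substrate_mmol: float) -> float:
    """Sucrose (mmol) consumed for complete conversion with full UDP recycling.

    Idealized assumption: one sucrose regenerates one UDP-glucose, and every
    UDP-glucose is used for exactly one glucosylation.
    """
    return GLUCOSES_ADDED[(substrate, product)] * substrate_mmol


if __name__ == "__main__":
    # Hypothetical batch: 10 mmol stevioside converted fully to rebaudioside D.
    print(sucrose_needed("stevioside", "rebaudioside D", 10.0))
```

Under these assumptions the same amount of fructose is released as sucrose is consumed, and only a catalytic quantity of UDP is needed.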

Conversions requiring multiple reactions may be carried out together, or stepwise. For example, rebaudioside D may be produced from rebaudioside A that is commercially available as an enriched extract or produced via biosynthesis, with the addition of stoichiometric or excess amounts of UDP-glucose and EUGT11. As an alternative, rebaudioside D may be produced from steviol glycoside extracts that are enriched for stevioside and rebaudioside A, using EUGT11 and a suitable UGT76G1 enzyme. In some embodiments, phosphatases are used to remove secondary products and improve the reaction yields. UGTs and other enzymes for in vitro reactions may be provided in soluble forms or in immobilized forms.

In some embodiments, rebaudioside A, rebaudioside D, or rebaudioside E can be produced using whole cells that are fed raw materials that contain precursor molecules such as steviol and/or steviol glycosides, including mixtures of steviol glycosides derived from plant extracts. The raw materials may be fed during cell growth or after cell growth. The whole cells may be in suspension or immobilized. The whole cells may be entrapped in beads, for example calcium or sodium alginate beads. The whole cells may be linked to a hollow fiber tube reactor system. The whole cells may be concentrated and entrapped within a membrane reactor system. The whole cells may be in fermentation broth or in a reaction buffer. In some embodiments, a permeabilizing agent is utilized for efficient transfer of substrate into the cells. In some embodiments, the cells are permeabilized with a solvent such as toluene, or with a detergent such as Triton-X or Tween. In some embodiments, the cells are permeabilized with a surfactant, for example a cationic surfactant such as cetyltrimethylammonium bromide (CTAB). In some embodiments, the cells are permeabilized with periodic mechanical shock such as electroporation or a slight osmotic shock. The cells can contain one recombinant UGT or multiple recombinant UGTs. For example, the cells can contain UGT76G1 and EUGT11 such that mixtures of stevioside and RebA are efficiently converted to RebD. In some embodiments, the whole cells are the host cells described in section III A. In some embodiments, the whole cells are a Gram-negative bacterium such as E. coli. In some embodiments, the whole cell is a Gram-positive bacterium such as Bacillus. In some embodiments, the whole cell is a fungal species such as Aspergillus, or a yeast such as Saccharomyces.

In some embodiments, the term “whole cell biocatalysis” is used to refer to the process in which the whole cells are grown as described above (e.g., in a medium and optionally permeabilized) and a substrate such as RebA or stevioside is provided and converted to the end product using the enzymes from the cells. The cells may or may not be viable, and may or may not be growing during the bioconversion reactions. In contrast, in fermentation, the cells are cultured in a growth medium and fed a carbon and energy source such as glucose and the end product is produced with viable cells.

B. 3 Dulcoside A and Rebaudioside C Biosynthesis Polypeptides

The biosynthesis of rebaudioside C and/or dulcoside A involves glucosylation and rhamnosylation of the aglycone steviol. Specifically, dulcoside A can be formed by glucosylation of the 13-OH of steviol which forms steviol-13-O-glucoside, rhamnosylation of the C-2′ of the 13-O-glucose of steviol-13-O-glucoside which forms the 1,2 rhamnobioside, and glucosylation of the C-19 carboxyl of the 1,2 rhamnobioside. Rebaudioside C can be formed by glucosylation of the C-3′ of the C-13-O-glucose of dulcoside A. The order in which each glycosylation reaction occurs can vary. See FIG. 2B.

It has been discovered that conversion of steviol to dulcoside A in a recombinant host can be accomplished by the expression of gene(s) encoding the following functional UGTs: 85C2, EUGT11 and/or 91D2e, and 74G1. Thus, a recombinant microorganism expressing these three or four UGTs and a rhamnose synthetase can make dulcoside A when fed steviol in the medium. Alternatively, a recombinant microorganism expressing two UGTs, EUGT11 and 74G1, and rhamnose synthetase can make dulcoside A when fed the monoside, steviol-13-O-glucoside or steviol-19-O-glucoside, in the medium. Similarly, conversion of steviol to rebaudioside C in a recombinant microorganism can be accomplished by the expression of gene(s) encoding UGTs 85C2, EUGT11, 74G1, 76G1, optionally 91D2e, and rhamnose synthetase when fed steviol, by the expression of genes encoding UGTs EUGT11 and/or 91D2e, 74G1, and 76G1, and rhamnose synthetase when fed steviol-13-O-glucoside, by the expression of genes encoding UGTs 85C2, EUGT11 and/or 91D2e, 76G1, and rhamnose synthetase when fed steviol-19-O-glucoside, or by the expression of genes encoding UGTs EUGT11 and/or 91D2e, 76G1, and rhamnose synthetase when fed rubusoside. Typically, one or more of these genes are recombinant genes that have been transformed into a microorganism that does not naturally possess them.

Suitable EUGT11, UGT91D2, UGT74G1, UGT76G1 and UGT85C2 polypeptides include the functional UGT polypeptides discussed herein. Rhamnose synthetase provides increased amounts of the UDP-rhamnose donor for rhamnosylation of the steviol compound acceptor. Suitable rhamnose synthetases include those made by Arabidopsis thaliana, such as the product of the A. thaliana RHM2 gene.

In some embodiments, a UGT79B3 polypeptide is substituted for a UGT91D2 polypeptide. Suitable UGT79B3 polypeptides include those made by Arabidopsis thaliana, which are capable of rhamnosylation of steviol 13-O-monoside in vitro. A. thaliana UGT79B3 can rhamnosylate glycosylated compounds to form 1,2-rhamnosides. The amino acid sequence of an Arabidopsis thaliana UGT79B3 is set forth in SEQ ID NO:150. The nucleotide sequence encoding the amino acid sequence of SEQ ID NO:150 is set forth in SEQ ID NO:151.

In some embodiments, rebaudioside C can be produced using in vitro methods while supplying the appropriate UDP-sugar and/or a cell-free system for regeneration of UDP-sugars. See, for example, “An integrated cell-free metabolic platform for protein production and synthetic biology” by Jewett M C, Calhoun K A, Voloshin A, Wuu J J and Swartz J R in Molecular Systems Biology, 4, article 220 (2008); Masada S et al. FEBS Letters, Vol. 581, 2562-2566 (2007). In some embodiments, sucrose and a sucrose synthase may be provided in the reaction vessel in order to regenerate UDP-glucose from UDP during the glycosylation reactions. See FIG. 11. The sucrose synthase can be from any suitable organism. For example, a sucrose synthase coding sequence from Arabidopsis thaliana, Stevia rebaudiana, or Coffea arabica can be cloned into an expression plasmid under control of a suitable promoter, and expressed in a host (e.g., a microorganism or a plant). In some embodiments, a RHM2 enzyme (rhamnose synthase) may also be provided, with NADPH, to generate UDP-rhamnose from UDP-glucose.

Reactions may be carried out together, or stepwise. For instance, rebaudioside C may be produced from rubusoside with the addition of stoichiometric amounts of UDP-rhamnose and EUGT11, followed by addition of UGT76G1 and an excess or stoichiometric supply of UDP-glucose. In some embodiments, phosphatases are used to remove secondary products and improve the reaction yields. UGTs and other enzymes for in vitro reactions may be provided in soluble forms or immobilized forms. In some embodiments, rebaudioside C, Dulcoside A, or other steviol rhamnosides can be produced using whole cells as discussed above. The cells can contain one recombinant UGT or multiple recombinant UGTs. For example, the cells can contain UGT76G1 and EUGT11 such that mixtures of stevioside and RebA are efficiently converted to RebD. In some embodiments, the whole cells are the host cells described in section III A.

In other embodiments, the recombinant host expresses one or more genes involved in steviol biosynthesis, e.g., a CDPS gene, a KS gene, a KO gene and/or a KAH gene. Thus, for example, a microorganism containing a CDPS gene, a KS gene, a KO gene and a KAH gene, in addition to a UGT85C2, a UGT74G1, a EUGT11 gene, optionally a UGT91D2e gene, and a UGT76G1 gene, is capable of producing rebaudioside C without the necessity for including steviol in the culture media. In addition, the recombinant host typically expresses an endogenous or a recombinant gene encoding a rhamnose synthetase. Such a gene is useful in order to provide increased amounts of the UDP-rhamnose donor for rhamnosylation of the steviol compound acceptor. Suitable rhamnose synthetases include those made by Arabidopsis thaliana, such as the product of the A. thaliana RHM2 gene.

One with skill in the art will recognize that by modulating relative expression levels of different UGT genes as well as modulating the availability of UDP-rhamnose, a recombinant host can be tailored to specifically produce steviol and steviol glycoside products in a desired proportion. Transcriptional regulation of steviol biosynthesis genes and steviol glycoside biosynthesis genes can be achieved by a combination of transcriptional activation and repression using techniques known to those in the art. For in vitro reactions, one with skill in the art will recognize that addition of different levels of UGT enzymes in combination or under conditions which impact the relative activities of the different UGTs in combination will direct synthesis towards a desired proportion of each steviol glycoside.

In some embodiments, the recombinant host further contains and expresses a recombinant GGPPS gene in order to provide increased levels of the diterpene precursor geranylgeranyl diphosphate, for increased flux through the rebaudioside A biosynthetic pathway. In some embodiments, the recombinant host further contains a construct to silence or reduce the expression of non-steviol pathways consuming geranylgeranyl diphosphate, ent-kaurenoic acid or farnesyl pyrophosphate, thereby providing increased flux through the steviol and steviol glycosides biosynthetic pathways. For example, flux to sterol production pathways such as ergosterol may be reduced by downregulation of the ERG9 gene. See, the ERG9 section below and Examples 24-25. In cells that produce gibberellins, gibberellin synthesis may be downregulated to increase flux of ent-kaurenoic acid to steviol. In carotenoid-producing organisms, flux to steviol may be increased by downregulation of one or more carotenoid biosynthetic genes.

In some embodiments, the recombinant host further contains and expresses recombinant genes involved in diterpene biosynthesis or production of terpenoid precursors, e.g., genes in the MEP or MEV pathway, have reduced phosphatase activity, and/or express a SUS as discussed herein.

In some embodiments, a recombinant host such as a microorganism produces steviol glycoside compositions that have greater than at least 15% rebaudioside C of the total steviol glycosides, e.g., at least 20% rebaudioside C, 30-40% rebaudioside C, 40-50% rebaudioside C, 50-60% rebaudioside C, 60-70% rebaudioside C, 70-80% rebaudioside C, or 80-90% rebaudioside C. In some embodiments, a recombinant host such as a microorganism produces steviol glycoside compositions that have at least 90% rebaudioside C, e.g., 90-99% rebaudioside C. Other steviol glycosides present may include those depicted in FIGS. 2A and 2B such as steviol monosides, steviol glucobiosides, steviol rhamnobiosides, rebaudioside A, and Dulcoside A. In some embodiments, the rebaudioside C-enriched composition produced by the host can be further purified and the rebaudioside C or Dulcoside A so purified may then be mixed with other steviol glycosides, flavors, or sweeteners to obtain a desired flavor system or sweetening composition. For instance, a rebaudioside C-enriched composition produced by a recombinant microorganism can be combined with a rebaudioside A, F, or D-enriched composition produced by a different recombinant microorganism, with rebaudioside A, F, or D purified from a Stevia extract, or with rebaudioside A, F, or D produced in vitro.

B. 4 Rebaudioside F Biosynthesis Polypeptides

The biosynthesis of rebaudioside F involves glucosylation and xylosylation of the aglycone steviol. Specifically, rebaudioside F can be formed by glucosylation of the 13-OH of steviol which forms steviol-13-O-glucoside, xylosylation of the C-2′ of the 13-O-glucose of steviol-13-O-glucoside which forms steviol-1,2-xylobioside, glucosylation of the C-19 carboxyl of the 1,2-xylobioside to form 1,2-stevioxyloside, and glucosylation of the C-3′ of the C-13-O-glucose of 1,2-stevioxyloside to form rebaudioside F. The order in which each glycosylation reaction occurs can vary. See FIG. 2D.

It has been discovered that conversion of steviol to rebaudioside F in a recombinant host can be accomplished by the expression of genes encoding the following functional UGTs: 85C2, EUGT11 and/or 91D2e, 74G1, and 76G1, along with endogenous or recombinantly expressed UDP-glucose dehydrogenase and UDP-glucuronic acid decarboxylase. Thus, a recombinant microorganism expressing these four or five UGTs along with endogenous or recombinant UDP-glucose dehydrogenase and UDP-glucuronic acid decarboxylase can make rebaudioside F when fed steviol in the medium. Alternatively, a recombinant microorganism expressing two functional UGTs, EUGT11 or 91D2e, and 76G1, can make rebaudioside F when fed rubusoside in the medium. As another alternative, a recombinant microorganism expressing a functional UGT76G1 can make rebaudioside F when fed 1,2 steviorhamnoside. As another alternative, a recombinant microorganism expressing 74G1, EUGT11 and/or 91D2e, and 76G1 can make rebaudioside F when fed the monoside, steviol-13-O-glucoside, in the medium. Similarly, conversion of steviol-19-O-glucoside to rebaudioside F in a recombinant microorganism can be accomplished by the expression of genes encoding UGTs 85C2, EUGT11 and/or 91D2e, and 76G1, when fed steviol-19-O-glucoside. Typically, one or more of these genes are recombinant genes that have been transformed into a host that does not naturally possess them.

Suitable EUGT11, UGT91D2, UGT74G1, UGT76G1 and UGT85C2 polypeptides include the functional UGT polypeptides discussed herein. In some embodiments, a UGT79B3 polypeptide is substituted for a UGT91, as discussed above. UDP-glucose dehydrogenase and UDP-glucuronic acid decarboxylase provide increased amounts of the UDP-xylose donor for xylosylation of the steviol compound acceptor. Suitable UDP-glucose dehydrogenases and UDP-glucuronic acid decarboxylases include those made by Arabidopsis thaliana or Cryptococcus neoformans. For example, suitable UDP-glucose dehydrogenase and UDP-glucuronic acid decarboxylase polypeptides can be encoded by the A. thaliana UGD1 gene and UXS3 gene, respectively. See, Oka and Jigami, FEBS J. 273:2645-2657 (2006).

In some embodiments, rebaudioside F can be produced using in vitro methods while supplying the appropriate UDP-sugar and/or a cell-free system for regeneration of UDP-sugars. See, for example, Jewett M C, et al. Molecular Systems Biology, Vol. 4, article 220 (2008); Masada S et al. FEBS Letters, Vol. 581, 2562-2566 (2007). In some embodiments, sucrose and a sucrose synthase are provided in the reaction vessel in order to regenerate UDP-glucose from UDP during the glycosylation reactions. See FIG. 11. The sucrose synthase can be from any suitable organism. For example, a sucrose synthase coding sequence from Arabidopsis thaliana, Stevia rebaudiana, or Coffea arabica can be cloned into an expression plasmid under control of a suitable promoter, and expressed in a host, e.g., a microorganism or a plant. In some embodiments, UDP-xylose can be produced from UDP-glucose by supplying suitable enzymes, for example, the Arabidopsis thaliana UGD1 (UDP-glucose dehydrogenase) and UXS3 (UDP-glucuronic acid decarboxylase) enzymes along with NAD+ cofactor.
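The two-enzyme route from UDP-glucose to UDP-xylose implies a cofactor demand that can be estimated in advance. The Python sketch below tallies NAD+ consumption using general textbook stoichiometry, which is an assumption brought in from outside this document: UDP-glucose dehydrogenase oxidizes UDP-glucose to UDP-glucuronic acid with two NAD+ per molecule, and the decarboxylase step releases CO2 with no net NAD+ consumption. The names `UDP_XYLOSE_STEPS` and `nad_required` are illustrative, and the calculation assumes complete conversion without NAD+ recycling.

```python
# (enzyme, substrate, product, net NAD+ consumed per molecule converted).
# Stoichiometry is general biochemistry, not taken from this document.
UDP_XYLOSE_STEPS = [
    ("UGD1", "UDP-glucose", "UDP-glucuronic acid", 2),
    ("UXS3", "UDP-glucuronic acid", "UDP-xylose", 0),
]


def nad_required(udp_xylose_mmol: float) -> float:
    """NAD+ (mmol) consumed to make the given amount of UDP-xylose.

    Assumes complete conversion and no regeneration of NAD+ from NADH.
    """
    nad_per_molecule = sum(nad for _enzyme, _sub, _prod, nad in UDP_XYLOSE_STEPS)
    return nad_per_molecule * udp_xylose_mmol


if __name__ == "__main__":
    # Hypothetical batch: 5 mmol UDP-xylose for xylosylation reactions.
    print(nad_required(5.0))
```

If an NADH-oxidizing system were included to recycle the cofactor, the NAD+ supplied could be reduced to a catalytic amount; the figure here is the upper bound without recycling.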

Reactions may be carried out together, or stepwise. For instance, rebaudioside F may be produced from rubusoside with the addition of stoichiometric amounts of UDP-xylose and EUGT11, followed by addition of UGT76G1 and an excess or stoichiometric supply of UDP-glucose. In some embodiments, phosphatases are used to remove secondary products and improve the reaction yields. UGTs and other enzymes for in vitro reactions may be provided in soluble forms or immobilized forms. In some embodiments, rebaudioside F or other steviol xylosides can be produced using whole cells as discussed above. For example, the cells may contain UGT76G1 and EUGT11 such that mixtures of stevioside and RebA are efficiently converted to RebD. In some embodiments, the whole cells are the host cells described in section III A.

In other embodiments, the recombinant host expresses one or more genes involved in steviol biosynthesis, e.g., a CDPS gene, a KS gene, a KO gene and/or a KAH gene. Thus, for example, a microorganism containing a CDPS gene, a KS gene, a KO gene and a KAH gene, in addition to an EUGT11 gene, a UGT85C2 gene, a UGT74G1 gene, an optional UGT91D2 gene, and a UGT76G1 gene, is capable of producing rebaudioside F without the necessity for including steviol in the culture media. In addition, the recombinant host typically expresses an endogenous or a recombinant gene encoding a UDP-glucose dehydrogenase and a UDP-glucuronic acid decarboxylase. Such genes are useful in order to provide increased amounts of the UDP-xylose donor for xylosylation of the steviol compound acceptor. Suitable UDP-glucose dehydrogenases and UDP-glucuronic acid decarboxylases include those made by Arabidopsis thaliana or Cryptococcus neoformans. For example, suitable UDP-glucose dehydrogenase and UDP-glucuronic acid decarboxylase polypeptides can be encoded by the A. thaliana UGD1 gene and UXS3 gene, respectively. See Oka and Jigami, FEBS J. 273:2645-2657 (2006).

One with skill in the art will recognize that by modulating relative expression levels of different UGT genes as well as modulating the availability of UDP-xylose, a recombinant microorganism can be tailored to specifically produce steviol and steviol glycoside products in a desired proportion. Transcriptional regulation of steviol biosynthesis genes can be achieved by a combination of transcriptional activation and repression using techniques known to those in the art. For in vitro reactions, one with skill in the art will recognize that addition of different levels of UGT enzymes in combination, or under conditions which impact the relative activities of the different UGTs in combination, will direct synthesis towards a desired proportion of each steviol glycoside.

In some embodiments, the recombinant host further contains and expresses a recombinant GGPPS gene in order to provide increased levels of the diterpene precursor geranylgeranyl diphosphate, for increased flux through the steviol biosynthetic pathway. In some embodiments, the recombinant host further contains a construct to silence the expression of non-steviol pathways consuming geranylgeranyl diphosphate, ent-kaurenoic acid or farnesyl pyrophosphate, thereby providing increased flux through the steviol and steviol glycosides biosynthetic pathways. For example, flux to sterol production pathways such as ergosterol may be reduced by downregulation of the ERG9 gene. See the ERG9 section below and Examples 24-25. In cells that produce gibberellins, gibberellin synthesis may be downregulated to increase flux of ent-kaurenoic acid to steviol. In carotenoid-producing organisms, flux to steviol may be increased by downregulation of one or more carotenoid biosynthetic genes. In some embodiments, the recombinant host further contains and expresses recombinant genes involved in diterpene biosynthesis, e.g., genes in the MEP pathway discussed below.

In some embodiments, a recombinant host such as a microorganism produces rebaudioside F-enriched steviol glycoside compositions that have greater than at least 4% rebaudioside F by weight total steviol glycosides, e.g., at least 5% rebaudioside F, at least 6% rebaudioside F, 10-20% rebaudioside F, 20-30% rebaudioside F, 30-40% rebaudioside F, 40-50% rebaudioside F, 50-60% rebaudioside F, 60-70% rebaudioside F, or 70-80% rebaudioside F. In some embodiments, a recombinant host such as a microorganism produces steviol glycoside compositions that have at least 90% rebaudioside F, e.g., 90-99% rebaudioside F. Other steviol glycosides present may include those depicted in FIGS. 2A and D such as steviol monosides, steviol glucobiosides, steviol xylobiosides, rebaudioside A, stevioxyloside, rubusoside and stevioside. In some embodiments, the rebaudioside F-enriched composition produced by the host can be mixed with other steviol glycosides, flavors, or sweeteners to obtain a desired flavor system or sweetening composition. For instance, a rebaudioside F-enriched composition produced by a recombinant microorganism can be combined with a rebaudioside A, C, or D-enriched composition produced by a different recombinant microorganism, with rebaudioside A, C, or D purified from a Stevia extract, or with rebaudioside A, C, or D produced in vitro.

C. Other Polypeptides

Genes for additional polypeptides whose expression facilitates more efficient or larger scale production of steviol or a steviol glycoside can also be introduced into a recombinant host. For example, a recombinant microorganism, plant, or plant cell can also contain one or more genes encoding a geranylgeranyl diphosphate synthase (GGPPS, also referred to as GGDPS). As another example, the recombinant host can contain one or more genes encoding a rhamnose synthetase, or one or more genes encoding a UDP-glucose dehydrogenase and/or a UDP-glucuronic acid decarboxylase. As another example, a recombinant host can also contain one or more genes encoding a cytochrome P450 reductase (CPR). Expression of a recombinant CPR facilitates the cycling of NADP+ to regenerate NADPH, which is utilized as a cofactor for terpenoid biosynthesis. Other methods can be used to regenerate NADPH levels as well. In circumstances where NADPH becomes limiting, strains can be further modified to include exogenous transhydrogenase genes. See, e.g., Sauer et al., J. Biol. Chem. 279: 6613-6619 (2004). Other methods are known to those with skill in the art to reduce or otherwise modify the ratio of NADH/NADPH such that the desired cofactor level is increased.

As another example, the recombinant host can contain one or more genes encoding one or more enzymes in the MEP pathway or the mevalonate pathway. Such genes are useful because they can increase the flux of carbon into the diterpene biosynthesis pathway, producing geranylgeranyl diphosphate from isopentenyl diphosphate and dimethylallyl diphosphate generated by the pathway. The geranylgeranyl diphosphate so produced can be directed towards steviol and steviol glycoside biosynthesis due to expression of steviol biosynthesis polypeptides and steviol glycoside biosynthesis polypeptides.

As another example, the recombinant host can contain one or more genes encoding a sucrose synthase, and additionally can contain sucrose uptake genes if desired. The sucrose synthase reaction can be used to increase the UDP-glucose pool in a fermentation host, or in a whole cell bioconversion process. This regenerates UDP-glucose from sucrose and the UDP produced during glycosylation, allowing for efficient glycosylation. In some organisms, disruption of the endogenous invertase is advantageous to prevent degradation of sucrose. For example, the S. cerevisiae SUC2 invertase may be disrupted. The sucrose synthase (SUS) can be from any suitable organism. For example, a sucrose synthase coding sequence from, without limitation, Arabidopsis thaliana, Stevia rebaudiana, or Coffea arabica can be cloned into an expression plasmid under control of a suitable promoter, and expressed in a host (e.g., a microorganism or a plant). The sucrose synthase can be expressed in such a strain in combination with a sucrose transporter (e.g., the A. thaliana SUC1 transporter or a functional homolog thereof) and one or more UGTs (e.g., one or more of UGT85C2, UGT74G1, UGT76G1, UGT91D2e, and EUGT11, or functional homologs thereof). Culturing the host in a medium that contains sucrose can promote production of UDP-glucose, as well as one or more glucosides (e.g., steviol glycosides).

In addition, a recombinant host can have reduced phosphatase activity as discussed herein.

C.1 MEP Biosynthesis Polypeptides

In some embodiments, a recombinant host contains one or more genes encoding enzymes involved in the methylerythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis. Enzymes in the MEP pathway include deoxyxylulose 5-phosphate synthase (DXS), D-1-deoxyxylulose 5-phosphate reductoisomerase (DXR), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (CMS), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (CMK), 4-diphosphocytidyl-2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MCS), 1-hydroxy-2-methyl-2(E)-butenyl 4-diphosphate synthase (HDS) and 1-hydroxy-2-methyl-2(E)-butenyl 4-diphosphate reductase (HDR). One or more DXS genes, DXR genes, CMS genes, CMK genes, MCS genes, HDS genes and/or HDR genes can be incorporated into a recombinant microorganism. See Rodriguez-Concepción and Boronat, Plant Phys. 130: 1079-1089 (2002).

Suitable genes encoding DXS, DXR, CMS, CMK, MCS, HDS and/or HDR polypeptides include those made by E. coli, Arabidopsis thaliana and Synechococcus leopoliensis. Nucleotide sequences encoding DXR polypeptides are described, for example, in U.S. Pat. No. 7,335,815.

C.2 Mevalonate Biosynthesis Polypeptides

In some embodiments, a recombinant host contains one or more genes encoding enzymes involved in the mevalonate pathway for isoprenoid biosynthesis. Genes suitable for transformation into a host encode enzymes in the mevalonate pathway such as a truncated 3-hydroxy-3-methyl-glutaryl (HMG)-CoA reductase (tHMG), and/or a gene encoding a mevalonate kinase (MK), and/or a gene encoding a phosphomevalonate kinase (PMK), and/or a gene encoding a mevalonate pyrophosphate decarboxylase (MPPD). Thus, one or more HMG-CoA reductase genes, MK genes, PMK genes, and/or MPPD genes can be incorporated into a recombinant host such as a microorganism.

Suitable genes encoding mevalonate pathway polypeptides are known. For example, suitable polypeptides include those made by E. coli, Paracoccus denitrificans, Saccharomyces cerevisiae, Arabidopsis thaliana, Kitasatospora griseola, Homo sapiens, Drosophila melanogaster, Gallus gallus, Streptomyces sp. KO-3988, Nicotiana attenuata, Hevea brasiliensis, Enterococcus faecium and Haematococcus pluvialis. See, e.g., Table 8 and U.S. Pat. Nos. 7,183,089, 5,460,949, and 5,306,862.

TABLE 8 Sources of HMG CoA Reductases and other Mevalonate Genes

Accession#       Organism                   Enzyme                           Size (nt)   Gene name       SEQ ID (codon optimized)   SEQ ID (protein)
XM_001467423     Leishmania infantum        Acetyl-CoA C-acetyltransferase   1323        MEV-4           103                        104
YML075C          Saccharomyces cerevisiae   Truncated HMG (tHMG1)            1584        tHMG1           105                        106
EU263989         Ganoderma lucidum          3-HMG-CoA reductase              3681        MEV-11          107                        108
BC153262         Bos taurus                 3-HMG-CoA reductase              2667        MEV-12          109                        110
AAD47596         Artemisia annua            3-HMG-CoA reductase              1704        MEV-13          111                        112
AAB62280         Trypanosoma cruzi          3-HMG-CoA reductase              1308        MEV-14          113                        114
CAG41604         Staph aureus               3-HMG-CoA reductase              1281        MEV-15          115                        116
DNA2.0 sequence  Archaeoglobus fulgidus     3-HMG-CoA reductase              1311        HMG reductase   117                        118
DNA2.0 sequence  Pseudomonas mevalonii      3-HMG-CoA reductase              1287        HMG reductase   119                        120

C.3 Sucrose Synthase Polypeptides

Sucrose synthase (SUS) can be used as a tool for generating UDP-sugar. SUS (EC 2.4.1.13) catalyzes the formation of UDP-glucose and fructose from sucrose and UDP (FIG. 11). UDP generated by the reaction of UGTs thus can be converted into UDP-glucose in the presence of sucrose. See, e.g., Chen et al. (2001) J. Am. Chem. Soc. 123:8866-8867; Shao et al. (2003) Appl. Env. Microbiol. 69:5238-5242; Masada et al. (2007) FEBS Lett. 581:2562-2566; and Son et al. (2009) J. Microbiol. Biotechnol. 19:709-712.
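For illustration, the coupling of the SUS reaction to a UGT-catalyzed glucosylation can be summarized by the following scheme, in which "acceptor" is a generic placeholder for any glucosyl acceptor (e.g., steviol or a steviol glycoside). The scheme is a simplified sketch of the stoichiometry only and is not a kinetic model.

```latex
\begin{align*}
\text{sucrose} + \text{UDP} &\rightleftharpoons \text{UDP-glucose} + \text{fructose} && \text{(SUS)}\\
\text{acceptor} + \text{UDP-glucose} &\longrightarrow \text{acceptor-glucoside} + \text{UDP} && \text{(UGT)}\\[4pt]
\text{net:}\quad \text{sucrose} + \text{acceptor} &\longrightarrow \text{acceptor-glucoside} + \text{fructose}
\end{align*}
```

In the net reaction UDP does not appear, which illustrates that UDP is recycled and that sucrose serves as the ultimate glucosyl donor.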

Sucrose synthases can be used to generate UDP-glucose and remove UDP, facilitating efficient glycosylation of compounds in various systems. For example, yeast deficient in the ability to utilize sucrose can be made to grow on sucrose by introducing a sucrose transporter and a SUS. For example, Saccharomyces cerevisiae does not have an efficient sucrose uptake system, and relies on extracellular SUC2 to utilize sucrose. The combination of disrupting the endogenous S. cerevisiae SUC2 invertase and expressing recombinant SUS resulted in a yeast strain that was able to metabolize intracellular but not extracellular sucrose (Riesmeier et al. (1992) EMBO J. 11:4705-4713). The strain was used to isolate sucrose transporters by transformation with a cDNA expression library and selection of transformants that had gained the ability to take up sucrose.

As described herein, the combined expression of recombinant sucrose synthase and a sucrose transporter in vivo can lead to increased UDP-glucose availability and removal of unwanted UDP. For example, functional expression of a recombinant sucrose synthase, a sucrose transporter, and a glycosyltransferase, in combination with knockout of the natural sucrose degradation system (SUC2 in the case of S. cerevisiae) can be used to generate a cell that is capable of producing increased amounts of glycosylated compounds such as steviol glycosides. This higher glycosylation capability is due to at least (a) a higher capacity for producing UDP-glucose in a more energy efficient manner, and (b) removal of UDP from growth medium, as UDP can inhibit glycosylation reactions.

The sucrose synthase can be from any suitable organism. For example, a sucrose synthase coding sequence from, without limitation, Arabidopsis thaliana, Stevia rebaudiana, or Coffea arabica (see, e.g., FIGS. 19A-19C, SEQ ID NOs:178, 179, and 180) can be cloned into an expression plasmid under control of a suitable promoter, and expressed in a host (e.g., a microorganism or a plant). As described in the Examples herein, a SUS coding sequence may be expressed in a SUC2 (sucrose hydrolyzing enzyme) deficient S. cerevisiae strain, so as to avoid degradation of extracellular sucrose by the yeast. The sucrose synthase can be expressed in such a strain in combination with a sucrose transporter (e.g., the A. thaliana SUC1 transporter or a functional homolog thereof) and one or more UGTs (e.g., one or more of UGT85C2, UGT74G1, UGT76G1, EUGT11, and UGT91D2e, or functional homologs thereof). Culturing the host in a medium that contains sucrose can promote production of UDP-glucose, as well as one or more glucosides (e.g., steviol glucoside). It is to be noted that in some cases, a sucrose synthase and a sucrose transporter can be expressed along with a UGT in a host cell that also is recombinant for production of a particular compound (e.g., steviol).

C.4 Modulation of ERG9 Activity

It is an object of the disclosure to produce terpenoids based on the concept of increasing the accumulation of terpenoid precursors of the squalene pathway. Non-limiting examples of terpenoids include Hemiterpenoids, 1 isoprene unit (5 carbons); Monoterpenoids, 2 isoprene units (10C); Sesquiterpenoids, 3 isoprene units (15C); Diterpenoids, 4 isoprene units (20C) (e.g. ginkgolides); Triterpenoids, 6 isoprene units (30C); Tetraterpenoids, 8 isoprene units (40C) (e.g. carotenoids); and Polyterpenoids with a larger number of isoprene units.
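The relationship between terpenoid class and carbon number listed above is simply five carbons per isoprene unit. A minimal sketch of that arithmetic follows; the dictionary and function names are hypothetical and are used only for illustration.

```python
# Illustrative only: names below are hypothetical, not part of the disclosure.
CARBONS_PER_ISOPRENE_UNIT = 5

# Number of isoprene units per terpenoid class, as listed in the text.
ISOPRENE_UNITS = {
    "hemiterpenoid": 1,
    "monoterpenoid": 2,
    "sesquiterpenoid": 3,
    "diterpenoid": 4,
    "triterpenoid": 6,
    "tetraterpenoid": 8,
}


def carbon_count(terpenoid_class: str) -> int:
    """Return the number of backbone carbons for a named terpenoid class."""
    units = ISOPRENE_UNITS[terpenoid_class.lower()]
    return CARBONS_PER_ISOPRENE_UNIT * units


if __name__ == "__main__":
    for name in ISOPRENE_UNITS:
        print(f"{name}: {carbon_count(name)}C")
```

For example, a diterpenoid such as steviol is built from 4 isoprene units and therefore has a 20-carbon backbone.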

Hemiterpenoids include isoprene, prenol and isovaleric acid. Monoterpenoids include Geranyl pyrophosphate, Eucalyptol, Limonene and Pinene. Sesquiterpenoids include Farnesyl pyrophosphate, Artemisinin and Bisabolol. Diterpenoids include Geranylgeranyl pyrophosphate, steviol, Retinol, Retinal, Phytol, Taxol, Forskolin and Aphidicolin. Triterpenoids include Squalene and Lanosterol. Tetraterpenoids include Lycopene and Carotene.

Terpenes are hydrocarbons resulting from the combination of several isoprene units. Terpenoids can be thought of as terpene derivatives. The term terpene is sometimes used broadly to include the terpenoids. Just like terpenes, the terpenoids can be classified according to the number of isoprene units used. The present invention is focussed on terpenoids and in particular terpenoids derived through the squalene pathway from the precursors Farnesyl-pyrophosphate (FPP), Isopentenyl-pyrophosphate (IPP), Dimethylallyl-pyrophosphate (DMAPP), Geranyl-pyrophosphate (GPP) and/or Geranylgeranyl-pyrophosphate (GGPP).

By terpenoids is understood terpenoids of the Hemiterpenoid class such as but not limited to isoprene, prenol and isovaleric acid; terpenoids of the Monoterpenoid class such as but not limited to geranyl pyrophosphate, eucalyptol, limonene and pinene; terpenoids of the Sesquiterpenoid class such as but not limited to farnesyl pyrophosphate, artemisinin and bisabolol; terpenoids of the Diterpenoid class such as but not limited to geranylgeranyl pyrophosphate, steviol, retinol, retinal, phytol, taxol, forskolin and aphidicolin; terpenoids of the Triterpenoid class such as but not limited to lanosterol; terpenoids of the Tetraterpenoid class such as but not limited to lycopene and carotene.

In one embodiment the invention relates to production of terpenoids, which are biosynthesized from Geranylgeranyl-pyrophosphate (GGPP). In particular such terpenoids may be steviol.

The cell

The present invention relates to a cell, such as any of the hosts described in section III, modified to comprise the construct depicted in FIG. 22. Accordingly, in a main aspect, the present invention relates to a cell comprising a nucleic acid, said nucleic acid comprising

i) a promoter sequence operably linked to

ii) a heterologous insert sequence operably linked to

iii) an open reading frame operably linked to

iv) a transcription termination signal,

wherein the heterologous insert sequence has the general formula (I):

-X₁-X₂-X₃-X₄-X₅-

wherein X₂ comprises at least 4 consecutive nucleotides being complementary to, and forms a hairpin secondary structure element with at least 4 consecutive nucleotides of X₄, and

wherein X₃ is optional and if present comprises nucleotides involved in forming a hairpin loop between X₂ and X₄, and

wherein X₁ and X₅ individually and optionally comprise one or more nucleotides, and

wherein the open reading frame upon expression encodes a squalene synthase (EC 2.5.1.21), e.g., a polypeptide sequence having at least 70% identity to a squalene synthase (EC 2.5.1.21) or a biologically active fragment thereof, said fragment having at least 70% sequence identity to said squalene synthase in a range of overlap of at least 100 amino acids.

In addition to the above mentioned nucleic acid comprising a heterologous insert sequence, the cell may also comprise one or more additional heterologous nucleic acid sequences (e.g., nucleic acids encoding any of the steviol and steviol glycoside biosynthesis polypeptides of section I). In one preferred embodiment the cell comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell.

Heterologous Insert Sequence

The heterologous insert sequence can adopt the secondary structure element of a hairpin with a hairpin loop. The hairpin part comprises sections X₂ and X₄ which are complementary and hybridize to one another. Sections X₂ and X₄ flank section X₃, which comprises nucleotides that form a loop, the hairpin loop. The term complementary is understood by the person skilled in the art as meaning two sequences compared to each other, nucleotide by nucleotide counting from the 5′ end to the 3′ end, or vice versa.

The heterologous insert sequence is long enough to allow a hairpin to be completed, but short enough to allow limited translation of an ORF that is present in-frame and immediately 3′ to the heterologous insert sequence. Thus, in one embodiment, the heterologous insert sequence comprises 10-50 nucleotides, preferably 10-30 nucleotides, more preferably 15-25 nucleotides, more preferably 17-22 nucleotides, more preferably 18-21 nucleotides, more preferably 18-20 nucleotides, more preferably 19 nucleotides.

X₂ and X₄ may individually consist of any suitable number of nucleotides, so long as a consecutive sequence of at least 4 nucleotides of X₂ is complementary to a consecutive sequence of at least 4 nucleotides of X₄. In a preferred embodiment X₂ and X₄ consist of the same number of nucleotides.
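The complementarity requirement for X₂ and X₄ can be checked computationally. The following is a minimal sketch, assuming DNA letters (A, C, G, T), Watson-Crick pairing only, and antiparallel pairing of the two hairpin arms (so that the X₂ stretch is compared with the reverse complement of X₄). The function names and the example sequences are hypothetical and are not taken from any SEQ ID NO herein.

```python
# Illustrative sketch: checks whether X2 and X4 share a stretch of at least
# `min_pairs` consecutive complementary nucleotides. All names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_stem(x2: str, x4: str) -> int:
    """Length of the longest run of consecutive X2 nucleotides whose
    reverse complement occurs as a consecutive run in X4."""
    x2, x4 = x2.upper(), x4.upper()
    best = 0
    for start in range(len(x2)):
        # Only try stretches longer than the best one found so far.
        for end in range(start + best + 1, len(x2) + 1):
            if reverse_complement(x2[start:end]) in x4:
                best = end - start
            else:
                break  # a longer stretch from this start cannot match either
    return best


def can_form_hairpin(x2: str, x4: str, min_pairs: int = 4) -> bool:
    """True if X2 and X4 satisfy the minimum complementarity requirement."""
    return longest_stem(x2, x4) >= min_pairs


if __name__ == "__main__":
    print(can_form_hairpin("GGATCCGA", "TCGGATCC"))  # fully complementary arms
    print(can_form_hairpin("AAAA", "CCCC"))          # no complementarity
```

This check considers sequence complementarity only; it does not predict whether the hairpin is thermodynamically stable in a given host.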

X₂ may for example consist of in the range of 4 to 25, such as in the range of 4 to 20, for example in the range of 4 to 15, such as in the range of 6 to 12, for example in the range of 8 to 12, such as in the range of 9 to 11 nucleotides.

X₄ may for example consist of in the range of 4 to 25, such as in the range of 4 to 20, for example in the range of 4 to 15, such as in the range of 6 to 12, for example in the range of 8 to 12, such as in the range of 9 to 11 nucleotides.

In one preferred embodiment, X₂ consists of a nucleotide sequence which is complementary to the nucleotide sequence of X₄, i.e. it is preferred that all nucleotides of X₂ are complementary to the nucleotide sequence of X₄.

In one preferred embodiment, X₄ consists of a nucleotide sequence which is complementary to the nucleotide sequence of X₂, i.e. it is preferred that all nucleotides of X₄ are complementary to the nucleotide sequence of X₂. Very preferably, X₂ and X₄ consist of the same number of nucleotides, wherein X₂ is complementary to X₄ over the entire length of X₂ and X₄.

X₃ may be absent, i.e., X₃ may consist of zero nucleotides. It is also possible that X₃ consists of in the range of 1 to 5, such as in the range of 1 to 3 nucleotides.

X₁ may be absent, i.e., X₁ may consist of zero nucleotides. It is also possible that X₁ consists of in the range of 1 to 25, such as in the range of 1 to 20, for example in the range of 1 to 15, such as in the range of 1 to 10, for example in the range of 1 to 5, such as in the range of 1 to 3 nucleotides.

X₅ may be absent, i.e., X₅ may consist of zero nucleotides. It is also possible that X₅ consists of in the range of 1 to 5, such as in the range of 1 to 3 nucleotides.

The sequence may be any suitable sequence fulfilling the requirements defined herein above. Thus, the heterologous insert sequence may comprise a sequence selected from the group consisting of SEQ ID NO:181, SEQ ID NO:182, SEQ ID NO:183, and SEQ ID NO:184. In a preferred embodiment the insert sequence is selected from the group consisting of SEQ ID NO:181, SEQ ID NO:182, SEQ ID NO:183, and SEQ ID NO:184.

Squalene Synthase

Squalene synthase (SQS) is the first committed enzyme of the biosynthesis pathway that leads to the production of sterols. It catalyzes the synthesis of squalene from farnesyl pyrophosphate via the intermediate presqualene pyrophosphate. This enzyme is a critical branch point enzyme in the biosynthesis of terpenoids/isoprenoids and is thought to regulate the flux of isoprene intermediates through the sterol pathway. The enzyme is sometimes referred to as farnesyl-diphosphate farnesyltransferase (FDFT1).

The mechanism of SQS is to convert two units of farnesyl pyrophosphate into squalene.
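For illustration, the generally accepted two-step reaction catalyzed by squalene synthase can be written as follows, where FPP is farnesyl pyrophosphate, PSPP is the intermediate presqualene pyrophosphate, and PPᵢ is inorganic pyrophosphate. The scheme is a stoichiometric summary only.

```latex
\begin{align*}
2\,\text{FPP} &\longrightarrow \text{PSPP} + \mathrm{PP_i}\\
\text{PSPP} + \mathrm{NADPH} + \mathrm{H^{+}} &\longrightarrow \text{squalene} + \mathrm{NADP^{+}} + \mathrm{PP_i}\\[4pt]
\text{net:}\quad 2\,\text{FPP} + \mathrm{NADPH} + \mathrm{H^{+}} &\longrightarrow \text{squalene} + \mathrm{NADP^{+}} + 2\,\mathrm{PP_i}
\end{align*}
```

Because two FPP molecules are consumed per squalene formed, reducing SQS activity leaves more FPP available for other branches, such as the formation of geranylgeranyl pyrophosphate.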

SQS is considered to be an enzyme of eukaryotes or advanced organisms, although at least one prokaryote has been shown to possess a functionally similar enzyme.

In terms of structure and mechanics, squalene synthase most closely resembles phytoene synthase, which serves a similar role in many plants in the elaboration of phytoene, a precursor of many carotenoid compounds.

A high level of sequence identity indicates likelihood that the first sequence is derived from the second sequence. Amino acid sequence identity requires identical amino acid sequences between two aligned sequences. Thus, a candidate sequence sharing 70% amino acid identity with a reference sequence requires that, following alignment, 70% of the amino acids in the candidate sequence are identical to the corresponding amino acids in the reference sequence. Identity may be determined by aid of computer analysis, such as, without limitation, the ClustalW computer alignment program as described in section D. Using this program with its default settings, the mature (bioactive) parts of a query and a reference polypeptide are aligned. The number of fully conserved residues is counted and divided by the length of the reference polypeptide. The ClustalW algorithm may similarly be used to align nucleotide sequences. Sequence identities may be calculated in a similar way as indicated for amino acid sequences.
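The identity calculation described above (fully conserved residues divided by the length of the reference polypeptide) can be sketched as follows. The sketch assumes that the two sequences have already been aligned, e.g., by ClustalW, and that gaps are written as "-"; the function name and the example sequences are hypothetical.

```python
# Illustrative sketch of the percent-identity calculation described in the
# text. Input sequences must already be aligned (equal length, gaps as "-").
def percent_identity(aligned_query: str, aligned_reference: str, gap: str = "-") -> float:
    """Identical aligned residues divided by the ungapped reference length, in %."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1
        for q, r in zip(aligned_query.upper(), aligned_reference.upper())
        if q == r and q != gap
    )

    # Length of the reference polypeptide, excluding alignment gaps.
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference sequence contains no residues")

    return 100.0 * matches / reference_length


if __name__ == "__main__":
    # Hypothetical 7-column alignment: 5 identical residues, reference length 7.
    identity = percent_identity("MKT-LLV", "MKTALLI")
    print(f"{identity:.1f}% identity; meets 70% threshold: {identity >= 70.0}")
```

Other conventions for the denominator exist (e.g., alignment length or the shorter sequence); the reference length is used here because that is the convention stated in the text.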

In one important embodiment, the cell of the present invention comprises a nucleic acid sequence coding, as defined herein, upon expression for a squalene synthase wherein the squalene synthase is at least 75%, such as at least 76%, such as at least 77%, such as at least 78%, such as at least 79%, such as at least 80%, such as at least 81%, such as at least 82%, such as at least 83%, such as at least 84%, such as at least 85%, such as at least 86%, such as at least 87%, such as at least 88%, such as at least 89%, such as at least 90%, such as at least 91%, such as at least 92%, such as at least 93%, such as at least 94%, such as at least 95%, such as at least 96%, such as at least 97%, such as at least 98%, such as at least 99%, such as at least 99.5%, such as at least 99.6%, such as at least 99.7%, such as at least 99.8%, such as at least 99.9%, such as 100% identical to a squalene synthase wherein the squalene synthase is selected from the group consisting of SEQ ID NO:192, SEQ ID NO:193, SEQ ID NO:194, SEQ ID NO:195, SEQ ID NO:196, SEQ ID NO:197, SEQ ID NO:198, SEQ ID NO:199, SEQ ID NO:200, SEQ ID NO:201, and SEQ ID NO:202.

Promoter

A promoter is a region of DNA that facilitates the transcription of a particular gene. Promoters are located near the genes they regulate, on the same strand and typically upstream (towards the 5′ region of the sense strand). In order for the transcription to take place, the enzyme that synthesizes RNA, known as RNA polymerase, must attach to the DNA near a gene. Promoters contain specific DNA sequences and response elements which provide a secure initial binding site for RNA polymerase and for proteins called transcription factors that recruit RNA polymerase. These transcription factors have specific activator or repressor sequences of corresponding nucleotides that attach to specific promoters and regulate gene expression.

In bacteria, the promoter is recognized by RNA polymerase and an associated sigma factor, which in turn are often brought to the promoter DNA by an activator protein binding to its own DNA binding site nearby. In eukaryotes the process is more complicated, and at least seven different factors are necessary for the binding of an RNA polymerase II to the promoter. Promoters represent critical elements that can work in concert with other regulatory regions (enhancers, silencers, boundary elements/insulators) to direct the level of transcription of a given gene.

As promoters are normally immediately adjacent to the open reading frame (ORF) in question, positions in the promoter are designated relative to the transcriptional start site, where transcription of RNA begins for a particular gene (i.e., positions upstream are negative numbers counting back from −1, for example −100 is a position 100 base pairs upstream).
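The numbering convention described above can be illustrated with a short sketch that converts a 0-based index in a sequence into a position relative to the transcriptional start site. The sketch assumes the common convention that the start site itself is numbered +1 and that there is no position 0; that assumption and the function name are not taken from the text.

```python
# Illustrative sketch: convert a 0-based sequence index into a position
# relative to the transcription start site (TSS). Assumes the TSS is +1 and
# that no position 0 exists (an assumption, stated in the lead-in).
def relative_position(index: int, tss_index: int) -> int:
    """Return the promoter coordinate of `index` given the TSS at `tss_index`."""
    offset = index - tss_index
    # Downstream (and the TSS itself): +1, +2, ...; upstream: -1, -2, ...
    return offset + 1 if offset >= 0 else offset


if __name__ == "__main__":
    tss = 100  # hypothetical 0-based index of the TSS in a sequence
    for idx in (0, 65, 90, 99, 100, 101):
        print(idx, relative_position(idx, tss))
```

Under this convention, the nucleotide 100 base pairs upstream of the start site is reported as −100, in agreement with the example given above.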

Promoter elements

- Core promoter: the minimal portion of the promoter required to properly initiate transcription
    - Transcription Start Site (TSS)
    - Approximately −35 bp upstream and/or downstream of the start site
    - A binding site for RNA polymerase
        - RNA polymerase I: transcribes genes encoding ribosomal RNA
        - RNA polymerase II: transcribes genes encoding messenger RNA and certain small nuclear RNAs
        - RNA polymerase III: transcribes genes encoding tRNAs and other small RNAs
    - General transcription factor binding sites
- Proximal promoter: the proximal sequence upstream of the gene that tends to contain primary regulatory elements
    - Approximately −250 bp upstream of the start site
    - Specific transcription factor binding sites
- Distal promoter: the distal sequence upstream of the gene that may contain additional regulatory elements, often with a weaker influence than the proximal promoter
    - Anything further upstream (but not an enhancer or other regulatory region whose influence is positional/orientation independent)
    - Specific transcription factor binding sites

Prokaryotic Promoters

In prokaryotes, the promoter consists of two short sequences at −10 and −35 positions upstream from the transcription start site. Sigma factors not only help in enhancing RNAP binding to the promoter but also help RNAP target specific genes to transcribe.

The sequence at −10 is called the Pribnow box, or the −10 element, and usually consists of the six nucleotides TATAAT. The Pribnow box is essential to start transcription in prokaryotes.

The other sequence at −35 (the −35 element) usually consists of the six nucleotides TTGACA. Its presence allows a very high transcription rate.

Both of the above consensus sequences, while conserved on average, are not found intact in most promoters. On average only 3 of the 6 base pairs in each consensus sequence are found in any given promoter. No promoter has been identified to date that has intact consensus sequences at both the −10 and −35; artificial promoters with complete conservation of the −10/−35 hexamers have been found to promote RNA chain initiation at very high efficiencies.
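The degree of conservation discussed above can be quantified by counting how many positions of a candidate hexamer agree with the consensus. The following is a minimal sketch; it assumes the widely cited hexamers TATAAT (−10) and TTGACA (−35), and the candidate hexamers shown are illustrative inputs only.

```python
# Illustrative sketch: count matches between a candidate hexamer and a
# consensus element. The consensus strings are assumptions stated in the lead-in.
MINUS_10_CONSENSUS = "TATAAT"
MINUS_35_CONSENSUS = "TTGACA"


def consensus_matches(site: str, consensus: str) -> int:
    """Number of positions at which `site` is identical to `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus.upper()))


if __name__ == "__main__":
    # Hypothetical candidate elements taken from a promoter of interest.
    print(consensus_matches("TATGTT", MINUS_10_CONSENSUS), "of 6 at -10")
    print(consensus_matches("TTTACA", MINUS_35_CONSENSUS), "of 6 at -35")
```

A simple identity count treats all positions as equally important; position weight matrices are commonly used when some positions are known to be more conserved than others.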

Some promoters contain a UP element (consensus sequence 5′-AAAWWTWTTTTNNNAAANNN-3′; W=A or T; N=any base) centered at −50; the presence of the −35 element appears to be unimportant for transcription from the UP element-containing promoters.

Eukaryotic Promoters
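The degenerate symbols in the UP element consensus (W = A or T; N = any base) can be expanded into a regular expression for scanning a sequence. The following is a minimal sketch using only the Python standard library; the function names and the test sequence are hypothetical, and an exact consensus match is a stricter criterion than would typically be applied to natural promoters.

```python
# Illustrative sketch: scan a DNA sequence for exact matches to the UP
# element consensus by expanding its degenerate symbols into a regex.
import re

# Only the symbols used in the consensus are mapped here.
DEGENERATE_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]",    # W = A or T
    "N": "[ACGT]",  # N = any base
}

UP_ELEMENT_CONSENSUS = "AAAWWTWTTTTNNNAAANNN"


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Compile a degenerate consensus string into a regular expression."""
    return re.compile("".join(DEGENERATE_TO_REGEX[symbol] for symbol in consensus))


UP_ELEMENT_REGEX = consensus_to_regex(UP_ELEMENT_CONSENSUS)


def find_up_elements(sequence: str) -> list:
    """Return 0-based start indices of non-overlapping consensus matches."""
    return [m.start() for m in UP_ELEMENT_REGEX.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence containing one exact match starting at index 2.
    print(find_up_elements("GG" + "AAAATTATTTTGCGAAACGT" + "CC"))
```

Because `finditer` reports non-overlapping matches, overlapping occurrences would need a lookahead-based pattern; that refinement is omitted from this sketch.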

Eukaryotic promoters are typically located upstream of the gene (ORF) and can have regulatory elements several kilobases (kb) away from the transcriptional start site. In eukaryotes, the transcriptional complex can cause the DNA to fold back on itself, which allows for placement of regulatory sequences far from the actual site of transcription. Many eukaryotic promoters contain a TATA box (sequence TATAAA), which in turn binds a TATA binding protein which assists in the formation of the RNA polymerase transcriptional complex. The TATA box typically lies very close to the transcriptional start site (often within 50 bases).

The cell of the present invention comprises a nucleic acid sequence which comprises a promoter sequence. The promoter sequence is not limiting for the invention and can be any promoter suitable for the host cell of choice.

In one embodiment of the present invention the promoter is a constitutive or inducible promoter.

In a further embodiment of the invention, the promoter is selected from the group consisting of an endogenous promoter, PGK-1, GPD1, PGK1, ADH1, ADH2, PYK1, TPI1, PDC1, TEF1, TEF2, FBA1, GAL1-10, CUP1, MET2, MET14, MET25, CYC1, GAL1-S, GAL1-L, TEF1, ADH1, CAG, CMV, human UbiC, RSV, EF-1alpha, SV40, Mt1, Tet-On, Tet-Off, Mo-MLV-LTR, Mx1, progesterone, RU486 and Rapamycin-inducible promoter.

Post-Transcriptional Regulation

Post-transcriptional regulation is the control of gene expression at the RNA level, therefore between the transcription and the translation of the gene.

The first instance of regulation is at transcription (transcriptional regulation) where, due to the chromatin arrangement and due to the activity of transcription factors, genes are differentially transcribed.

After being produced, the stability and distribution of the different transcripts is regulated (post-transcriptional regulation) by means of RNA binding proteins (RBPs) that control the various steps and rates of the transcripts: events such as alternative splicing, nuclear degradation (exosome), processing, nuclear export (three alternative pathways), sequestration in DCP2-bodies for storage or degradation, and ultimately translation. These proteins achieve these events thanks to an RNA recognition motif (RRM) that binds a specific sequence or secondary structure of the transcripts, typically at the 5′ and 3′ UTR of the transcript.

Modulating the capping, splicing, addition of a Poly(A) tail, the sequence-specific nuclear export rates and in several contexts sequestration of the RNA transcript occurs in eukaryotes but not in prokaryotes. This modulation is a result of a protein or transcript which in turn is regulated and may have an affinity for certain sequences.

Capping

Capping changes the five prime end of the mRNA to a three prime end by a 5′-5′ linkage, which protects the mRNA from 5′ exonuclease, which degrades foreign RNA.

The cap also helps in ribosomal binding.

Splicing

Splicing removes the introns, noncoding regions that are transcribed into RNA, in order to make the mRNA able to create proteins. Cells do this by spliceosomes binding on either side of an intron, looping the intron into a circle and then cleaving it off. The two ends of the exons are then joined together.
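The outcome of splicing described above (introns removed, flanking exons joined) can be illustrated with a minimal sketch. It treats a transcript as a plain string and assumes the intron coordinates are already known; the function name `splice`, the toy sequence and the coordinates are hypothetical and are not taken from this disclosure. It does not model spliceosome recognition of splice sites.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns, given as half-open (start, end) index pairs, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        # Reject overlapping, empty or out-of-range intron coordinates.
        if start < position or start >= end or end > len(pre_mrna):
            raise ValueError("introns must be non-overlapping and lie inside the transcript")
        exons.append(pre_mrna[position:start])  # exon preceding this intron
        position = end                          # resume after the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Toy transcript (hypothetical): exon 1, a 12-nucleotide intron, exon 2.
toy_pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
mature_mrna = splice(toy_pre_mrna, [(6, 18)])
```

Here `mature_mrna` contains only the two exon segments joined end to end.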

Polyadenylation

Polyadenylation is the addition of a poly(A) tail to the 3′ end; the poly(A) tail consists of multiple adenosine monophosphates. The poly(A) sequence acts as a buffer against the 3′ exonuclease and thus increases the half-life of the mRNA. In addition, a long poly(A) tail can increase translation. Thus the poly(A) tail may be used to further modulate translation of the construct of the present invention, in order to arrive at the optimal translation rate.

In eukaryotes, polyadenylation is part of the process that produces mature messenger RNA (mRNA) for translation.

The poly(A) tail is also important for the nuclear export, translation, and stability of mRNA.

In one embodiment the nucleic acid sequence of the cell of the present invention, as defined herein above, further comprises a polyadenyl/polyadenylation sequence; preferably the 5′ end of said polyadenyl/polyadenylation sequence is operably linked to the 3′ end of the open reading frame, such as to the open reading frame encoding squalene synthase.

RNA Editing

RNA editing is a process which results in sequence variation in the RNA molecule, and is catalyzed by enzymes. These enzymes include the Adenosine Deaminase Acting on RNA (ADAR) enzymes, which convert specific adenosine residues to inosine in an mRNA molecule by hydrolytic deamination. Three ADAR enzymes have been cloned, ADAR1, ADAR2 and ADAR3, although only the first two subtypes have been shown to have RNA editing activity. Many mRNAs are vulnerable to the effects of RNA editing, including the glutamate receptor subunits GluR2, GluR3, GluR4, GluR5 and GluR6 (which are components of the AMPA and kainate receptors), the serotonin2C receptor, the GABA-alpha3 receptor subunit, the tryptophan hydroxylase enzyme TPH2, the hepatitis delta virus and more than 16% of microRNAs. In addition to ADAR enzymes, CDAR enzymes exist, and these convert cytosines in specific RNA molecules to uracil. These enzymes are termed ‘APOBEC’ and have genetic loci at 22q13, a region close to the chromosomal deletion which occurs in velocardiofacial syndrome (22q11) and which is linked to psychosis. RNA editing is extensively studied in relation to infectious diseases, because the editing process alters viral function.

Post-Transcriptional Regulatory Elements

Use of a post-transcriptional regulatory element (PRE) is often necessary to obtain vectors with sufficient performance for certain applications. Schambach et al. in Gene Ther. (2006) 13(7):641-5 report that introduction of a post-transcriptional regulatory element (PRE) of woodchuck hepatitis virus (WHV) into the 3′ untranslated region of retroviral and lentiviral gene transfer vectors enhances both titer and transgene expression. The enhancing activity of the PRE depends on the precise configuration of its sequence and the context of the vector and cell into which it is introduced.

Thus use of a PRE such as a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE) may be useful in the preparation of the cell of the present invention when using a gene therapeutic approach.

Accordingly, in one embodiment the nucleic acid sequence of the cell defined herein further comprises a post-transcriptional regulatory element.

In a further embodiment, the post-transcriptional regulatory element is a Woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).

Terminal Repeats

To insert genetic sequences into host DNA, viruses often use sequences of DNA that repeat up to thousands of times, so-called repeats, or terminal repeats, including long terminal repeats (LTR) and inverted terminal repeats (ITR), wherein said repeat sequences may be both 5′ and 3′ terminal repeats. ITRs aid in concatamer formation in the nucleus after the single-stranded vector DNA is converted by host cell DNA polymerase complexes into double-stranded DNA. ITR sequences may be derived from viral vectors, such as AAV, e.g. AAV2.

In one embodiment, the nucleic acid sequence or the vector of the cell defined herein comprises a 5′ terminal repeat and a 3′ terminal repeat.

In one embodiment said 5′ and 3′ terminal repeats are selected from Inverted Terminal Repeats [ITR] and Long Terminal Repeats [LTR].

In one embodiment said 5′ and 3′ terminal repeats are AAV Inverted Terminal Repeats [ITR].

Geranylgeranyl Pyrophosphate Synthase

The microbial cells of the present invention may in preferred embodiments contain a heterologous nucleic acid sequence encoding Geranylgeranyl Pyrophosphate Synthase (GGPPS). See, e.g., Table 7. GGPPS is an enzyme which catalyzes the chemical reaction that turns one farnesyl pyrophosphate (FPP) molecule into one Geranylgeranyl Pyrophosphate (GGPP) molecule. Genes encoding GGPPS may for example be found in organisms that contain the mevalonate pathway.

The GGPPS to be used with the present invention may be any useful enzyme which is capable of catalysing conversion of a farnesyl pyrophosphate (FPP) molecule into a Geranylgeranyl Pyrophosphate (GGPP) molecule. In particular, the GGPPS to be used with the present invention may be any enzyme capable of catalysing the following reaction:

(2E,6E)-farnesyl diphosphate + isopentenyl diphosphate -> diphosphate + geranylgeranyl diphosphate.
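As a quick plausibility check on the reaction above, the following sketch verifies that the atoms balance when each species is written as its neutral, fully protonated molecular formula. The formulas are standard textbook values rather than values taken from this disclosure, and the helper names `total_atoms` and `is_balanced` are illustrative.

```python
from collections import Counter

# Neutral (fully protonated) molecular formulas; assumed standard values.
FORMULAS = {
    "farnesyl diphosphate": {"C": 15, "H": 28, "O": 7, "P": 2},
    "isopentenyl diphosphate": {"C": 5, "H": 12, "O": 7, "P": 2},
    "geranylgeranyl diphosphate": {"C": 20, "H": 36, "O": 7, "P": 2},
    "diphosphate": {"H": 4, "O": 7, "P": 2},
}


def total_atoms(species: list[str]) -> Counter:
    """Sum the element counts over a list of species names."""
    total = Counter()
    for name in species:
        total.update(FORMULAS[name])  # Counter.update adds counts
    return total


def is_balanced(substrates: list[str], products: list[str]) -> bool:
    """Return True when both sides of the reaction contain the same atoms."""
    return total_atoms(substrates) == total_atoms(products)


substrates = ["farnesyl diphosphate", "isopentenyl diphosphate"]
products = ["diphosphate", "geranylgeranyl diphosphate"]
balanced = is_balanced(substrates, products)
```

The check reflects the chain extension performed by the enzyme: a C15 prenyl diphosphate is condensed with a C5 unit to give a C20 product, with release of one diphosphate.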

It is preferred that the GGPPS used with the present invention is an enzyme categorised under EC 2.5.1.29.

The GGPPS may be GGPPS from a variety of sources, such as from bacteria, fungi or mammals. The GGPPS may be any kind of GGPPS, for example GGPPS-1, GGPPS-2, GGPPS-3 or GGPPS-4. The GGPPS may be wild type GGPPS or a functional homologue thereof.

For example, the GGPPS may be GGPPS-1 of S. acidocaldarius (SEQ ID NO:126), GGPPS-2 of A. nidulans (SEQ ID NO:203), GGPPS-3 of S. cerevisiae (SEQ ID NO:167) or GGPPS-4 of M. musculus (SEQ ID NO:123) or a functional homologue of any of the aforementioned.

The heterologous nucleic acid encoding said GGPPS may be any nucleic acid sequence encoding said GGPPS. Thus, in embodiments of the invention where GGPPS is a wild type protein, the nucleic acid sequence may for example be a wild type cDNA sequence encoding said protein. However, it is frequently the case that the heterologous nucleic acid is a nucleic acid sequence encoding any particular GGPPS, where said nucleic acid has been codon optimised for the particular microbial cell. Thus, by way of example, if the microbial cell is S. cerevisiae, then the nucleic acid encoding GGPPS has preferably been codon optimised for optimal expression in S. cerevisiae.
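The idea of codon optimisation can be sketched as back-translating a protein sequence with one chosen codon per amino acid. The table below is a hypothetical placeholder covering only six residues: each codon does encode the listed amino acid in the standard genetic code, but the choices are not measured S. cerevisiae codon preferences and the peptide is not a GGPPS sequence. In practice the table would be derived from codon usage data for the chosen host, and real optimisation tools weigh further factors that this sketch ignores.

```python
# Hypothetical one-codon-per-residue table, for illustration only.
# The codons are valid in the standard genetic code, but they are NOT
# measured host codon preferences.
ILLUSTRATIVE_CODONS = {
    "M": "ATG",
    "W": "TGG",
    "K": "AAA",
    "A": "GCT",
    "G": "GGT",
    "L": "TTG",
}


def back_translate(protein: str, codon_table: dict[str, str]) -> str:
    """Build a coding sequence by substituting one chosen codon per residue."""
    missing = sorted(set(protein) - set(codon_table))
    if missing:
        raise KeyError(f"no codon assigned for residues: {missing}")
    return "".join(codon_table[residue] for residue in protein)


# Toy peptide (hypothetical), not a sequence from this disclosure.
coding_sequence = back_translate("MKAGLW", ILLUSTRATIVE_CODONS)
```

The encoded protein is unchanged by this substitution; only the choice among synonymous codons differs.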

Functional homologues of GGPPS are preferably proteins having the above-mentioned activity and sharing at least 70% amino acid identity with the sequence of a reference GGPPS. Methods for determining sequence identity are described herein above in the section “Squalene synthase” and in section D.
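The identity threshold can be illustrated with a minimal sketch that computes percent identity over two sequences that have already been aligned (gaps written as "-"). This is a simplified stand-in, not the method referred to in the section “Squalene synthase” or section D, whose alignment procedure and denominator may differ. The function names and toy sequences are hypothetical. Note that identity alone does not show that a protein has GGPPS activity, which the definition above also requires.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_ref: str, aligned_query: str,
                             threshold: float = 70.0) -> bool:
    """Check the identity criterion only; enzymatic activity is tested separately."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Toy aligned sequences (hypothetical): one mismatch and one gap in ten columns.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQ-")
```

The `threshold` argument can be raised to model the narrower embodiments, for example 75% or 90%.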

In one embodiment, the cell, such as the microbial cell of the present invention, comprises a nucleic acid sequence coding for a GGPPS or a functional homologue thereof, where said functional homologue is at least 75%, such as at least 76%, such as at least 77%, such as at least 78%, such as at least 79%, such as at least 80%, such as at least 81%, such as at least 82%, such as at least 83%, such as at least 84%, such as at least 85%, such as at least 86%, such as at least 87%, such as at least 88%, such as at least 89%, such as at least 90%, such as at least 91%, such as at least 92%, such as at least 93%, such as at least 94%, such as at least 95%, such as at least 96%, such as at least 97%, such as at least 98%, such as at least 99%, such as at least 99.5%, such as at least 99.6%, such as at least 99.7%, such as at least 99.8%, such as at least 99.9%, such as 100% identical to a GGPPS selected from the group consisting of SEQ ID NO:123, SEQ ID NO:126, SEQ ID NO:167 and SEQ ID NO:203.

Said heterologous nucleic acid sequence encoding a GGPPS is in general operably linked to a nucleic acid sequence directing expression of GGPPS in the microbial cell. The nucleic acid sequence directing expression of GGPPS in the microbial cell may be a promoter sequence, and preferably said promoter sequence is selected according to the particular microbial cell. The promoter may for example be any of the promoters described herein above in the section “Promoter”.

Vectors

A vector is a DNA molecule used as a vehicle to transfer foreign genetic material into another cell. The major types of vectors are plasmids, viruses, cosmids, and artificial chromosomes. Common to all engineered vectors is an origin of replication, a multicloning site, and a selectable marker.

The vector itself is generally a DNA sequence that consists of an insert (transgene) and a larger sequence that serves as the “backbone” of the vector. The purpose of a vector which transfers genetic information to another cell is typically to isolate, multiply, or express the insert in the target cell. Vectors called expression vectors (expression constructs) specifically are for the expression of the transgene in the target cell, and generally have a promoter sequence that drives expression of the transgene. Simpler vectors called transcription vectors are only capable of being transcribed but not translated: they can be replicated in a target cell but not expressed, unlike expression vectors. Transcription vectors are used to amplify their insert.

Insertion of a vector into the target cell is usually called transformation for bacterial cells and transfection for eukaryotic cells, although insertion of a viral vector is often called transduction.

Plasmids

Plasmid vectors are double-stranded, generally circular DNA sequences that are capable of autonomously replicating in a host cell. Plasmid vectors minimally consist of an origin of replication that allows for semi-independent replication of the plasmid in the host and also the transgene insert. Modern plasmids generally have many more features, notably including a “multiple cloning site” which includes nucleotide overhangs for insertion of an insert, and multiple restriction enzyme consensus sites to either side of the insert. In the case of plasmids utilized as transcription vectors, incubating bacteria with plasmids generates hundreds or thousands of copies of the vector within the bacteria in hours, and the vectors can be extracted from the bacteria, and the multiple cloning site can be cut by restriction enzymes to excise the hundredfold or thousandfold amplified insert. These plasmid transcription vectors characteristically lack crucial sequences that code for polyadenylation sequences and translation termination sequences in translated mRNAs, making protein expression from transcription vectors impossible. Plasmids may be conjugative (transmissible) or non-conjugative:

-   conjugative: mediate DNA transfer through conjugation and therefore spread rapidly among the bacterial cells of a population; e.g., F plasmid, many R and some col plasmids.
-   non-conjugative: do not mediate DNA transfer through conjugation; e.g., many R and col plasmids.

Viral Vectors

Viral vectors are generally genetically engineered viruses carrying modified viral DNA or RNA that has been rendered noninfectious but still contains viral promoters and also the transgene, thus allowing for expression of the transgene through a viral promoter. However, because viral vectors frequently are lacking infectious sequences, they require helper viruses or packaging lines for large-scale transfection. Viral vectors are often designed for permanent incorporation of the insert into the host genome, and thus leave distinct genetic markers in the host genome after incorporating the transgene. For example, retroviruses leave a characteristic retroviral integration pattern after insertion that is detectable and indicates that the viral vector has incorporated into the host genome.

In one embodiment the invention concerns a viral vector capable of transfecting a host cell, such as a cell that can be cultured, e.g. a yeast cell or any other suitable eukaryotic cell. The vector is then capable of transfecting said cell with a nucleic acid that includes the heterologous insert sequence as described herein.

The viral vector can be any suitable viral vector such as a viral vector selected from the group consisting of vectors derived from the Retroviridae family including lentivirus, HIV, SIV, FIV, EAIV, CIV.

The viral vector may also be selected from the group consisting of alphavirus, adenovirus, adeno associated virus, baculovirus, HSV, coronavirus, Bovine papilloma virus and Mo-MLV.

In embodiments of the invention wherein the microbial cell comprises a heterologous nucleic acid encoding GGPPS, said heterologous nucleic acid may be positioned on the vector also containing the nucleic acid encoding squalene synthase, or the heterologous nucleic acid encoding GGPPS may be positioned on a different vector. Said heterologous nucleic acid encoding GGPPS may be contained in any of the vectors described herein above.

In embodiments of the invention wherein the microbial cell comprises a heterologous nucleic acid encoding HMCR, said heterologous nucleic acid may be positioned on the vector also containing the nucleic acid encoding squalene synthase, or the heterologous nucleic acid encoding HMCR may be positioned on a different vector. Said heterologous nucleic acid encoding HMCR may be contained in any of the vectors described herein above. It is also contained within the invention that the heterologous nucleic acid encoding GGPPS and the heterologous nucleic acid encoding HMCR may be positioned on the same or on individual vectors.

Transcription

Transcription is a necessary component in all vectors: the premise of a vector is to multiply the insert (although expression vectors later also drive the translation of the multiplied insert). Thus, even stable expression is determined by stable transcription, which generally depends on promoters in the vector. However, expression vectors have a variety of expression patterns: constitutive (consistent expression) or inducible (expression only under certain conditions or chemicals). This expression is based on different promoter activities, not post-transcriptional activities. Thus, these two different types of expression vectors depend on different types of promoters.

Viral promoters are often used for constitutive expression in plasmids and in viral vectors because they normally reliably force constant transcription in many cell lines and types.

Inducible expression depends on promoters that respond to the induction conditions: for example, the murine mammary tumor virus promoter only initiates transcription after dexamethasone application and the Drosophila heat shock promoter only initiates transcription after exposure to high temperatures. Transcription is the synthesis of mRNA: genetic information is copied from DNA to RNA.

Expression

Expression vectors require sequences that encode, for example:

Polyadenylation tail (see herein above): creates a polyadenylation tail at the end of the transcribed pre-mRNA that protects the mRNA from exonucleases and ensures transcriptional and translational termination; stabilizes mRNA production.

Minimal UTR length: UTRs contain specific characteristics that may impede transcription or translation, and thus the shortest UTRs or none at all are encoded for in optimal expression vectors.

Kozak sequence: vectors should encode for a Kozak sequence in the mRNA, which assembles the ribosome for translation of the mRNA.

The above conditions are necessary for expression vectors in eukaryotes but not in prokaryotes.

Modern vectors may encompass additional features besides the transgene insert and a backbone, such as a promoter (discussed above), genetic markers, e.g. to allow for confirmation that the vector has integrated with the host genomic DNA, antibiotic resistance genes for antibiotic selection, and affinity tags for purification.

In one embodiment the cell of the present invention comprises a nucleic acid sequence integrated in a vector such as an expression vector.

In one embodiment the vector is selected from the group consisting of plasmid vectors, cosmids, artificial chromosomes and viral vectors.

The plasmid vector should be able to be maintained and replicated in bacteria, fungi and yeast.

The present invention also concerns cells comprising plasmid and cosmid vectors as well as artificial chromosome vectors.

The important factor is that the vector is functional and that the vector comprises at least the nucleic acid sequence comprising the heterologous insert sequence as described herein.

In one embodiment the vector is functional in fungi and in mammalian cells.

In one embodiment the invention concerns a cell transformed or transduced with the vector as defined herein above.

Methods for Producing Terpenoids

As mentioned herein above, the cell of the present invention (e.g., recombinant host cells) is useful in enhancing yield of industrially relevant terpenoids.

The cell of the invention may therefore be used in various set-ups in order to increase accumulation of terpenoid precursors and thus to increase yield of terpenoid products resulting from enzymatic conversion of said (upstream) terpenoid precursors.

Accordingly, in one aspect the present invention relates to a method for producing a terpenoid compound synthesized through the squalene pathway, in a cell culture, said method comprising the steps of

(a) providing the cell as defined herein above,

(b) culturing the cell of (a), and

(c) recovering the terpenoid product compound.

By providing the cell comprising the genetically modified construct defined herein above, the accumulation of terpenoid precursors is enhanced (FIG. 20).

Thus, in another aspect, the invention relates to a method for producing a terpenoid derived from a terpenoid precursor selected from the group consisting of Farnesyl-pyrophosphate (FPP), Isopentenyl-pyrophosphate (IPP), Dimethylallyl-pyrophosphate (DMAPP), Geranyl-pyrophosphate (GPP) and/or Geranylgeranyl-pyrophosphate (GGPP), said method comprising:

(a) contacting said precursor with an enzyme of the squalene synthase pathway, and

(b) recovering the terpenoid product.

In one embodiment the terpenoid (product) of the method of the present invention as defined herein above is selected from the group consisting of hemiterpenoids, monoterpenes, sesquiterpenoids, diterpenoids, sesterpenes, triterpenoids, tetraterpenoids and polyterpenoids.

In a further embodiment the terpenoid is selected from the group consisting of farnesyl phosphate, farnesol, geranylgeranyl, geranylgeraniol, isoprene, prenol, isovaleric acid, geranyl pyrophosphate, eucalyptol, limonene, pinene, farnesyl pyrophosphate, artemisinin, bisabolol, geranylgeranyl pyrophosphate, retinol, retinal, phytol, taxol, forskolin, aphidicolin, lanosterol, lycopene and carotene.

The terpenoid product can be used as a starting point in an additional refining process. Thus, in one embodiment said method further comprises dephosphorylating the farnesyl phosphate to produce farnesol.

The enzyme or enzymes used in the process of preparing the target terpenoid product compound are preferably enzymes “located downstream” of the terpenoid precursors Farnesyl-pyrophosphate, Isopentenyl-pyrophosphate, Dimethylallyl-pyrophosphate, Geranyl-pyrophosphate and Geranylgeranyl-pyrophosphate, such as an enzyme located downstream of these precursors as depicted in the squalene pathway of FIG. 20. The enzyme used in the process of preparing the target terpenoid product, based on the accumulation of precursors achieved through the present invention, may thus be selected from the group consisting of Dimethylallyltransferase (EC 2.5.1.1), Isoprene synthase (EC 4.2.3.27) and Geranyltranstransferase (EC 2.5.1.10).

The present invention may operate by at least partly sterically hindering binding of the ribosome to the RNA, thus reducing the translation of squalene synthase.

Accordingly, in one aspect the present invention relates to a method for reducing the translation rate of a functional squalene synthase (EC 2.5.1.21), said method comprising:

(a) providing the cell defined herein above, and

(b) culturing the cell of (a).

Similarly, the invention in another aspect relates to a method for decreasing turnover of farnesyl-pp to squalene, said method comprising:

(a) providing the cell defined herein above, and

(b) culturing the cell of (a).

As depicted in FIG. 20, the knocking down of ERG9 results in build-up of precursors to squalene synthase. Thus in one aspect, the present invention relates to a method for enhancing accumulation of a compound selected from the group consisting of Farnesyl-pyrophosphate, Isopentenyl-pyrophosphate, Dimethylallyl-pyrophosphate, Geranyl-pyrophosphate and Geranylgeranyl-pyrophosphate, said method comprising the steps of:

(a) providing the cell defined herein above, and

(b) culturing the cell of (a).

In one embodiment the method of the invention as defined herein above further comprises recovering the Farnesyl-pyrophosphate, Isopentenyl-pyrophosphate, Dimethylallyl-pyrophosphate, Geranyl-pyrophosphate or Geranylgeranyl-pyrophosphate compound. The recovered compound may be used in further processes for producing the desired terpenoid product compound. The further process may take place in the same cell culture as the process performed and defined herein above, such as the accumulation of the terpenoid precursors by the cell of the present invention. Alternatively, the recovered precursors may be added to another cell culture, or a cell free system, to produce the desired products.

As the precursors are intermediates, albeit mainly stable intermediates, a certain endogenous production of terpenoid products may occur based on the terpenoid precursor substrates. Also, the cells of the invention may have additional genetic modifications such that they are capable of performing both the accumulation of the terpenoid precursors (construct of the cell of the invention) and the whole or substantially the whole subsequent biosynthesis process to the desired terpenoid product.

Thus, in one embodiment the method of the invention further comprises recovering a compound synthesized through the squalene pathway, said compound being derived from said Farnesyl-pyrophosphate, Isopentenyl-pyrophosphate, Dimethylallyl-pyrophosphate, Geranyl-pyrophosphate and/or Geranylgeranyl-pyrophosphate. Occasionally it may be advantageous to include a squalene synthase inhibitor when culturing the cell of the present invention. Chemical inhibition of squalene synthase, e.g. by lapaquistat, is known in the art and is under investigation, e.g. as a method of lowering cholesterol levels in the prevention of cardiovascular disease. It has also been suggested that variants in this enzyme may be part of a genetic association with hypercholesterolemia. Other squalene synthase inhibitors include Zaragozic acid and RPR 107393.

Thus, in one embodiment the culturing step of the method(s) defined herein above is performed in the presence of a squalene synthase inhibitor.

The cell of the invention may furthermore be genetically modified to further enhance production of certain key terpenoid precursors. In one embodiment the cell is additionally genetically modified to enhance activity of and/or overexpress one or more enzymes selected from the group consisting of Phosphomevalonate kinase (EC 2.7.4.2), Diphosphomevalonate decarboxylase (EC 4.1.1.33), 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase (EC 1.17.7.1), 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (EC 1.17.1.2), Isopentenyl-diphosphate Delta-isomerase 1 (EC 5.3.3.2), Short-chain Z-isoprenyl diphosphate synthase (EC 2.5.1.68), Dimethylallyltransferase (EC 2.5.1.1), Geranyltranstransferase (EC 2.5.1.10) and Geranylgeranyl pyrophosphate synthetase (EC 2.5.1.29).

As described herein above, in one embodiment of the invention the microbial cell comprises both a nucleic acid encoding a squalene synthase as described herein above and a heterologous nucleic acid encoding a GGPPS. Such microbial cells are particularly useful for the preparation of GGPP as well as terpenoids, wherein GGPP is an intermediate in their biosynthesis.

Accordingly, in one aspect the invention relates to a method for preparing GGPP, wherein the method comprises the steps of

a. providing a microbial cell comprising a nucleic acid sequence, said nucleic acid comprising
    i) a promoter sequence operably linked to
    ii) a heterologous insert sequence operably linked to
    iii) an open reading frame operably linked to
    iv) a transcription termination signal,
    wherein the heterologous insert sequence and the open reading frame are as defined herein above,
    wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell;
b. cultivating the microbial cell of a.; and
c. recovering the GGPP.
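The element order recited in step a. (promoter, heterologous insert sequence, open reading frame, transcription termination signal) can be modelled as a simple ordered record, for instance when checking construct designs in silico. The sketch below is illustrative only: the class `Element`, the function `has_required_order` and all element names are hypothetical labels, and the check covers the order of the parts, not whether they are functionally linked.

```python
from dataclasses import dataclass

# 5' to 3' order of the elements recited in step a.
REQUIRED_ORDER = (
    "promoter",
    "heterologous_insert",
    "open_reading_frame",
    "terminator",
)


@dataclass(frozen=True)
class Element:
    kind: str  # one of the entries in REQUIRED_ORDER
    name: str  # free-text label; the values used below are hypothetical


def has_required_order(elements: list[Element]) -> bool:
    """Return True when the construct lists exactly the four elements in order."""
    return tuple(element.kind for element in elements) == REQUIRED_ORDER


# Hypothetical construct whose open reading frame encodes squalene synthase.
construct = [
    Element("promoter", "example promoter"),
    Element("heterologous_insert", "example insert sequence"),
    Element("open_reading_frame", "squalene synthase ORF"),
    Element("terminator", "example transcription termination signal"),
]
order_ok = has_required_order(construct)
```

A construct with the insert placed after the open reading frame, or with any element missing, would fail this check.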

In another aspect the invention relates to a method for preparing a terpenoid of which GGPP is an intermediate in the biosynthesis pathway, wherein the method comprises the steps of

a. providing a microbial cell, wherein said microbial cell comprises a nucleic acid sequence, said nucleic acid comprising
    i) a promoter sequence operably linked to
    ii) a heterologous insert sequence operably linked to
    iii) an open reading frame operably linked to
    iv) a transcription termination signal,
    wherein the heterologous insert sequence and the open reading frame are as defined herein above,
    wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell;
b. cultivating the microbial cell of a.; and
c. recovering the terpenoid,

wherein said terpenoid may be any terpenoid described herein above in the section “Terpenoids” having GGPP as intermediate in its biosynthesis; and said microbial cell may be any of the microbial cells described herein above in the section “The cell”; and said promoter may be any promoter, such as any of the promoters described herein above in the section “Promoter”; and said heterologous insert sequence may be any of the heterologous insert sequences described herein above in the section “Heterologous insert sequence”; and said open reading frame encodes a squalene synthase, which may be any of the squalene synthases described herein above in the section “Squalene synthase”; and said GGPPS may be any of the GGPPS described herein above in the section “Geranylgeranyl Pyrophosphate Synthase”.

In this embodiment said microbial cell may also optionally contain one or more additional heterologous nucleic acids encoding one or more enzymes involved in the biosynthesis pathway of said terpenoid.

In one particular aspect the invention relates to a method for preparing steviol, wherein the method comprises the steps of

a. providing a microbial cell, wherein said microbial cell comprises a nucleic acid sequence, said nucleic acid comprising
    i) a promoter sequence operably linked to
    ii) a heterologous insert sequence operably linked to
    iii) an open reading frame operably linked to
    iv) a transcription termination signal,
    wherein the heterologous insert sequence and the open reading frame are as defined herein above,
    wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell;
b. cultivating the microbial cell of a.; and
c. recovering steviol,

wherein said microbial cell may be any of the microbial cells described herein above in the section “The cell”; and said promoter may be any promoter, such as any of the promoters described herein above in the section “Promoter”; and said heterologous insert sequence may be any of the heterologous insert sequences described herein above in the section “Heterologous insert sequence”; and said open reading frame encodes a squalene synthase, which may be any of the squalene synthases described herein above in the section “Squalene synthase”; and said GGPPS may be any of the GGPPS described herein above in the section “Geranylgeranyl Pyrophosphate Synthase”.

In this embodiment said microbial cell may also optionally contain one or more additional heterologous nucleic acids encoding one or more enzymes involved in the biosynthesis pathway of steviol.

In another aspect the invention relates to a method for preparing a terpenoid of which GGPP is an intermediate in the biosynthesis pathway, wherein the method comprises the steps of

a. providing a microbial cell, wherein said microbial cell comprises a nucleic acid sequence, said nucleic acid comprising
    i) a promoter sequence operably linked to
    ii) a heterologous insert sequence operably linked to
    iii) an open reading frame operably linked to
    iv) a transcription termination signal,
    wherein the heterologous insert sequence and the open reading frame are as defined herein above,
    wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell,
    and wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding HMCR operably linked to a nucleic acid sequence directing expression of HMCR in said cell;
b. cultivating the microbial cell of a.; and
c. recovering the terpenoid,

wherein said terpenoid may be any terpenoid described herein above in the section “Terpenoids” having GGPP as intermediate in its biosynthesis; and said microbial cell may be any of the microbial cells described herein above in the section “The cell”; and said promoter may be any promoter, such as any of the promoters described herein above in the section “Promoter”; and said heterologous insert sequence may be any of the heterologous insert sequences described herein above in the section “Heterologous insert sequence”; and said open reading frame encodes a squalene synthase, which may be any of the squalene synthases described herein above in the section “Squalene synthase”; and said GGPPS may be any of the GGPPS described herein above in the section “Geranylgeranyl Pyrophosphate Synthase”; and said HMCR may be any of the HMCR described herein above in the section “HMCR”.

In this embodiment said microbial cell may also optionally contain one or more additional heterologous nucleic acids encoding one or more enzymes involved in the biosynthesis pathway of said terpenoid.

In one particular aspect the invention relates to a method for preparingsteviol, wherein the method comprises the steps of

-   a. providing a microbial cell, wherein said microbial cell comprises a nucleic acid sequence, said nucleic acid comprising
    -   i) a promoter sequence operably linked to
    -   ii) a heterologous insert sequence operably linked to
    -   iii) an open reading frame operably linked to
    -   iv) a transcription termination signal,
    wherein the heterologous insert sequence and the open reading frame are as defined herein above,
    wherein said microbial cell furthermore comprises a heterologous nucleic acid encoding GGPPS operably linked to a nucleic acid sequence directing expression of GGPPS in said cell;
-   b. cultivating the microbial cell of a.;
-   c. recovering steviol,

wherein said microbial cell may be any of the microbial cells described herein above in the section “The cell”; and said promoter may be any promoter, such as any of the promoters described herein above in the section “Promoter”; and said heterologous insert sequence may be any of the heterologous insert sequences described herein above in the section “Heterologous insert sequence”; and said open reading frame encodes a squalene synthase, which may be any of the squalene synthases described herein above in the section “Squalene synthase”; and said GGPPS may be any of the GGPPS described herein above in the section “Geranylgeranyl Pyrophosphate Synthase”; and said HMCR may be any of the HMCR described herein above in the section “HMCR”.

In this embodiment said microbial cell may also optionally contain one or more additional heterologous nucleic acids encoding one or more enzymes involved in the biosynthesis pathway of steviol.

In one embodiment the cell is additionally genetically modified to enhance activity of and/or overexpress one or more enzymes selected from the group consisting of acetoacetyl CoA thiolase, HMG-CoA reductase or the catalytic domain thereof, HMG-CoA synthase, mevalonate kinase, phosphomevalonate kinase, phosphomevalonate decarboxylase, isopentenyl pyrophosphate isomerase, farnesyl pyrophosphate synthase, D-1-deoxyxylulose 5-phosphate synthase, and 1-deoxy-D-xylulose 5-phosphate reductoisomerase.

In one embodiment of the method of the present invention, the cell comprises a mutation in the ERG9 open reading frame.

In another embodiment of the method of the present invention the cell comprises an ERG9Δ::HIS3 deletion/insertion allele.

In yet another embodiment the step of recovering the compound in the method of the present invention further comprises purification of said compound from the cell culture media.

D. Functional Homologs

Functional homologs of the polypeptides described above are also suitable for use in producing steviol or steviol glycosides in a recombinant host. A functional homolog is a polypeptide that has sequence similarity to a reference polypeptide, and that carries out one or more of the biochemical or physiological function(s) of the reference polypeptide. A functional homolog and the reference polypeptide may be naturally occurring polypeptides, and the sequence similarity may be due to convergent or divergent evolutionary events. As such, functional homologs are sometimes designated in the literature as homologs, or orthologs, or paralogs. Variants of a naturally occurring functional homolog, such as polypeptides encoded by mutants of a wild type coding sequence, may themselves be functional homologs. Functional homologs can also be created via site-directed mutagenesis of the coding sequence for a polypeptide, or by combining domains from the coding sequences for different naturally-occurring polypeptides (“domain swapping”). Techniques for modifying genes encoding functional UGT polypeptides described herein are known and include, inter alia, directed evolution techniques, site-directed mutagenesis techniques and random mutagenesis techniques, and can be useful to increase specific activity of a polypeptide, alter substrate specificity, alter expression levels, alter subcellular location, or modify polypeptide:polypeptide interactions in a desired manner. Such modified polypeptides are considered functional homologs. The term “functional homolog” is sometimes applied to the nucleic acid that encodes a functionally homologous polypeptide.

Functional homologs can be identified by analysis of nucleotide and polypeptide sequence alignments. For example, performing a query on a database of nucleotide or polypeptide sequences can identify homologs of steviol or steviol glycoside biosynthesis polypeptides. Sequence analysis can involve BLAST, Reciprocal BLAST, or PSI-BLAST analysis of nonredundant databases using a GGPPS, a CDPS, a KS, a KO or a KAH amino acid sequence as the reference sequence. Amino acid sequence is, in some instances, deduced from the nucleotide sequence. Those polypeptides in the database that have greater than 40% sequence identity are candidates for further evaluation for suitability as a steviol or steviol glycoside biosynthesis polypeptide. Amino acid sequence similarity allows for conservative amino acid substitutions, such as substitution of one hydrophobic residue for another or substitution of one polar residue for another. If desired, manual inspection of such candidates can be carried out in order to narrow the number of candidates to be further evaluated. Manual inspection can be performed by selecting those candidates that appear to have domains present in steviol biosynthesis polypeptides, e.g., conserved functional domains.

Conserved regions can be identified by locating a region within the primary amino acid sequence of a steviol or a steviol glycoside biosynthesis polypeptide that is a repeated sequence, forms some secondary structure (e.g., helices and beta sheets), establishes positively or negatively charged domains, or represents a protein motif or domain. See, e.g., the Pfam web site describing consensus sequences for a variety of protein motifs and domains on the World Wide Web at sanger.ac.uk/Software/Pfam/ and pfam.janelia.org/. The information included at the Pfam database is described in Sonnhammer et al., Nucl. Acids Res., 26:320-322 (1998); Sonnhammer et al., Proteins, 28:405-420 (1997); and Bateman et al., Nucl. Acids Res., 27:260-262 (1999). Conserved regions also can be determined by aligning sequences of the same or related polypeptides from closely related species. Closely related species preferably are from the same family. In some embodiments, alignment of sequences from two different species is adequate.

Typically, polypeptides that exhibit at least about 40% amino acid sequence identity are useful to identify conserved regions. Conserved regions of related polypeptides exhibit at least 45% amino acid sequence identity (e.g., at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% amino acid sequence identity). In some embodiments, a conserved region exhibits at least 92%, 94%, 96%, 98%, or 99% amino acid sequence identity.

For example, polypeptides suitable for producing steviol glycosides in a recombinant host include functional homologs of EUGT11, UGT91D2e, UGT91D2m, UGT85C, and UGT76G. Such homologs have greater than 90% (e.g., at least 95% or 99%) sequence identity to the amino acid sequence of EUGT11 (SEQ ID NO:152), UGT91D2e (SEQ ID NO:5), UGT91D2m (SEQ ID NO:10), UGT85C (SEQ ID NO:3), or UGT76G (SEQ ID NO:7). Variants of EUGT11, UGT91D2, UGT85C, and UGT76G polypeptides typically have 10 or fewer amino acid substitutions within the primary amino acid sequence, e.g., 7 or fewer amino acid substitutions, 5 or fewer conservative amino acid substitutions, or between 1 and 5 substitutions. However, in some embodiments, variants of EUGT11, UGT91D2, UGT85C, and UGT76G polypeptides can have 10 or more amino acid substitutions (e.g., 10, 15, 20, 25, 30, 35, 10-20, 10-35, 20-30, or 25-35 amino acid substitutions). The substitutions may be conservative, or in some embodiments, non-conservative. Non-limiting examples of non-conservative changes in UGT91D2e polypeptides include glycine to arginine and tryptophan to arginine. Non-limiting examples of non-conservative substitutions in UGT76G polypeptides include valine to glutamic acid, glycine to glutamic acid, glutamine to alanine, and serine to proline. Non-limiting examples of changes to UGT85C polypeptides include histidine to aspartic acid, proline to serine, lysine to threonine, and threonine to arginine.

In some embodiments, a useful UGT91D2 homolog can have amino acid substitutions (e.g., conservative amino acid substitutions) in regions of the polypeptide that are outside of predicted loops, e.g., residues 20-26, 39-43, 88-95, 121-124, 142-158, 185-198, and 203-214 are predicted loops in the N-terminal domain and residues 381-386 are predicted loops in the C-terminal domain of SEQ ID NO:5. For example, a useful UGT91D2 homolog can include at least one amino acid substitution at residues 1-19, 27-38, 44-87, 96-120, 125-141, 159-184, 199-202, 215-380, or 387-473 of SEQ ID NO:5. In some embodiments, a UGT91D2 homolog can have an amino acid substitution at one or more residues selected from the group consisting of residues 30, 93, 99, 122, 140, 142, 148, 153, 156, 195, 196, 199, 206, 207, 211, 221, 286, 343, 427, and 438 of SEQ ID NO:5. For example, a UGT91D2 functional homolog can have an amino acid substitution at one or more of residues 206, 207, and 343, such as an arginine at residue 206, a cysteine at residue 207, and an arginine at residue 343 of SEQ ID NO:5. See, SEQ ID NO:95. Other functional homologs of UGT91D2 can have one or more of the following: a tyrosine or phenylalanine at residue 30, a proline or glutamine at residue 93, a serine or valine at residue 99, a tyrosine or a phenylalanine at residue 122, a histidine or tyrosine at residue 140, a serine or cysteine at residue 142, an alanine or threonine at residue 148, a methionine at residue 152, an alanine at residue 153, an alanine or serine at residue 156, a glycine at residue 162, a leucine or methionine at residue 195, a glutamic acid at residue 196, a lysine or glutamic acid at residue 199, a leucine or methionine at residue 211, a leucine at residue 213, a serine or phenylalanine at residue 221, a valine or isoleucine at residue 253, a valine or alanine at residue 286, a lysine or asparagine at residue 427, an alanine at residue 438, and either an alanine or threonine at residue 462 of SEQ ID NO:5. In another embodiment, a UGT91D2 functional homolog contains a methionine at residue 211 and an alanine at residue 286.
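The relationship between the predicted loops and the residue ranges outside them can be checked mechanically. The following is a minimal sketch in Python, assuming a polypeptide length of 473 residues (the last residue named above for SEQ ID NO:5); the function and constant names are illustrative only and are not part of this disclosure.

```python
# Predicted loop residues of SEQ ID NO:5, as listed in the paragraph above
# (inclusive, 1-based residue numbers).
PREDICTED_LOOPS = [(20, 26), (39, 43), (88, 95), (121, 124),
                   (142, 158), (185, 198), (203, 214), (381, 386)]
SEQ_LENGTH = 473  # assumption: last residue number named in the text


def non_loop_regions(loops, length):
    """Return the inclusive residue ranges that fall outside the given loops."""
    regions = []
    start = 1
    for lo, hi in sorted(loops):
        if lo > start:
            regions.append((start, lo - 1))
        start = hi + 1
    if start <= length:
        regions.append((start, length))
    return regions


def in_predicted_loop(residue, loops=PREDICTED_LOOPS):
    """True if a 1-based residue number lies inside any predicted loop."""
    return any(lo <= residue <= hi for lo, hi in loops)


if __name__ == "__main__":
    for lo, hi in non_loop_regions(PREDICTED_LOOPS, SEQ_LENGTH):
        print(f"{lo}-{hi}")
```

Under these assumptions the computed complement reproduces the ranges 1-19, 27-38, 44-87, 96-120, 125-141, 159-184, 199-202, 215-380, and 387-473 recited above. The helper `in_predicted_loop` can be used to see which individually listed residues sit inside a predicted loop (for example, residue 206) and which do not (for example, residue 343).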

In some embodiments, a useful UGT85C homolog can have one or more amino acid substitutions at residues 9, 10, 13, 15, 21, 27, 60, 65, 71, 87, 91, 220, 243, 270, 289, 298, 334, 336, 350, 368, 389, 394, 397, 418, 420, 440, 441, 444, and 471 of SEQ ID NO:3. Non-limiting examples of useful UGT85C homologs include polypeptides having substitutions (with respect to SEQ ID NO:3) at residue 65 (e.g., a serine at residue 65), at residue 65 in combination with residue 15 (a leucine at residue 15), 270 (e.g., a methionine, arginine, or alanine at residue 270), 418 (e.g., a valine at residue 418), 440 (e.g., an aspartic acid at residue 440), or 441 (e.g., an asparagine at residue 441); residues 13 (e.g., a phenylalanine at residue 13), 15, 60 (e.g., an aspartic acid at residue 60), 270, 289 (e.g., a histidine at residue 289), and 418; substitutions at residues 13, 60, and 270; substitutions at residues 60 and 87 (e.g., a phenylalanine at residue 87); substitutions at residues 65, 71 (e.g., a glutamine at residue 71), 220 (e.g., a threonine at residue 220), 243 (e.g., a tryptophan at residue 243), and 270; substitutions at residues 65, 71, 220, 243, 270, and 441; substitutions at residues 65, 71, 220, 389 (e.g., a valine at residue 389), and 394 (e.g., a valine at residue 394); substitutions at residues 65, 71, 270, and 289; substitutions at residues 220, 243, 270, and 334 (e.g., a serine at residue 334); or substitutions at residues 270 and 289. The following amino acid mutations did not result in a loss of activity in 85C2 polypeptides: V13F, F15L, H60D, A65S, E71Q, I87F, K220T, R243W, T270M, T270R, Q289H, L334S, A389V, I394V, P397S, E418V, G440D, and H441N. Additional mutations that were seen in active clones include K9E, K10R, Q21H, M27V, L91P, Y298C, K350T, H368R, G420R, L431P, R444G, and M471T. In some embodiments, a UGT85C2 contains substitutions at positions 65 (e.g., a serine), 71 (a glutamine), 270 (a methionine), 289 (a histidine), and 389 (a valine).
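The substitution shorthand used above (e.g., A65S: alanine at residue 65 replaced by serine) lends itself to a simple parser. The following Python sketch is illustrative only; the helper names are hypothetical, and the five-residue sequence in the usage lines is a toy string, not any sequence of this disclosure.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a code such as 'A65S' into (wild_type, position, replacement)."""
    match = MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(sequence, codes):
    """Apply substitution codes to a sequence, checking each wild-type residue."""
    residues = list(sequence)
    for code in codes:
        wild_type, position, replacement = parse_mutation(code)
        if position < 1 or position > len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: residue {position} is not {wild_type}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutation("A65S"))
    # Toy sequence for illustration only.
    print(apply_mutations("MAVKF", ["V3F", "K4T"]))
```

Checking the wild-type residue before substituting guards against numbering a variant relative to the wrong reference sequence.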

The amino acid sequences of Stevia rebaudiana UGTs 74G1, 76G1 and 91D2c with N-terminal, in-frame fusions of the first 158 amino acids of human MDM2 protein, and Stevia rebaudiana UGT85C2 with an N-terminal in-frame fusion of 4 repeats of the synthetic PMI peptide (4 X TSFAEYWNLLSP, SEQ ID NO:86) are set forth in SEQ ID NOs: 90, 88, 94, and 92, respectively; see SEQ ID NOs: 89, 92, 93, and 95 for the nucleotide sequences encoding the fusion proteins.

In some embodiments, a useful UGT76G homolog can have one or more amino acid substitutions at residues 29, 74, 87, 91, 116, 123, 125, 126, 130, 145, 192, 193, 194, 196, 198, 199, 200, 203, 204, 205, 206, 207, 208, 266, 273, 274, 284, 285, 291, 330, 331, and 346 of SEQ ID NO:7. Non-limiting examples of useful UGT76G homologs include polypeptides having substitutions (with respect to SEQ ID NO:7) at residues 74, 87, 91, 116, 123, 125, 126, 130, 145, 192, 193, 194, 196, 198, 199, 200, 203, 204, 205, 206, 207, 208, and 291; residues 74, 87, 91, 116, 123, 125, 126, 130, 145, 192, 193, 194, 196, 198, 199, 200, 203, 204, 205, 206, 207, 208, 266, 273, 274, 284, 285, and 291; or residues 74, 87, 91, 116, 123, 125, 126, 130, 145, 192, 193, 194, 196, 198, 199, 200, 203, 204, 205, 206, 207, 208, 266, 273, 274, 284, 285, 291, 330, 331, and 346. See, Table 9.

TABLE 9

Clone | Mutations
76G_G7 | M29I, V74E, V87G, L91P, G116E, A123T, Q125A, I126L, T130A, V145M, C192S, S193A, F194Y, M196N, K198Q, K199I, Y200L, Y203I, F204L, E205G, N206K, I207M, T208I, P266Q, S273P, R274S, G284T, T285S, 287-3 bp deletion, L330V, G331A, L346I
76G_H12 | M29I, V74E, V87G, L91P, G116E, A123T, Q125A, I126L, T130A, V145M, C192S, S193A, F194Y, M196N, K198Q, K199I, Y200L, Y203I, F204L, E205G, N206K, I207M, T208I, P266Q, S273P, R274S, G284T, T285S, 287-3 bp deletion
76G_C4 | M29I, V74E, V87G, L91P, G116E, A123T, Q125A, I126L, T130A, V145M, C192S, S193A, F194Y, M196N, K198Q, K199I, Y200L, Y203I, F204L, E205G, N206K, I207M, T208I
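The three clones in Table 9 are nested: each longer mutation list contains the shorter one. A minimal Python sketch, assuming only the point substitutions of Table 9 are modeled (the "287-3 bp deletion" entry is deliberately left out because it is not a single-residue substitution), makes the differences between clones explicit. The names `SHARED`, `CLONES`, and `extra_substitutions` are illustrative.

```python
# Substitutions common to all three clones (the complete 76G_C4 list in Table 9).
SHARED = ("M29I V74E V87G L91P G116E A123T Q125A I126L T130A V145M C192S S193A "
          "F194Y M196N K198Q K199I Y200L Y203I F204L E205G N206K I207M T208I").split()

# Point substitutions per clone; the 3 bp deletion at 287 is not represented here.
CLONES = {
    "76G_C4": set(SHARED),
    "76G_H12": set(SHARED) | set("P266Q S273P R274S G284T T285S".split()),
    "76G_G7": set(SHARED) | set(
        "P266Q S273P R274S G284T T285S L330V G331A L346I".split()),
}


def extra_substitutions(clone, baseline, clones=CLONES):
    """Substitutions in `clone` that are absent from `baseline`, by residue number."""
    extra = clones[clone] - clones[baseline]
    return sorted(extra, key=lambda code: int(code[1:-1]))


if __name__ == "__main__":
    print(extra_substitutions("76G_H12", "76G_C4"))
    print(extra_substitutions("76G_G7", "76G_H12"))
```

Sorting by the numeric part of each code keeps the output in sequence order, which makes it easy to compare against the residue lists given in the preceding paragraph.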

Methods to modify the substrate specificity of, for example, EUGT11 or UGT91D2e, are known to those skilled in the art, and include without limitation site-directed/rational mutagenesis approaches, random directed evolution approaches and combinations in which random mutagenesis/saturation techniques are performed near the active site of the enzyme. For example see Sarah A. Osmani, et al. Phytochemistry 70 (2009) 325-347.

A candidate sequence typically has a length that is from 80 percent to 200 percent of the length of the reference sequence, e.g., 82, 85, 87, 89, 90, 93, 95, 97, 99, 100, 105, 110, 115, 120, 130, 140, 150, 160, 170, 180, 190, or 200 percent of the length of the reference sequence. A functional homolog polypeptide typically has a length that is from 95 percent to 105 percent of the length of the reference sequence, e.g., 90, 93, 95, 97, 99, 100, 105, 110, 115, or 120 percent of the length of the reference sequence, or any range between. A percent identity for any candidate nucleic acid or polypeptide relative to a reference nucleic acid or polypeptide can be determined as follows. A reference sequence (e.g., a nucleic acid sequence or an amino acid sequence) is aligned to one or more candidate sequences using the computer program ClustalW (version 1.83, default parameters), which allows alignments of nucleic acid or polypeptide sequences to be carried out across their entire length (global alignment). Chenna et al., Nucleic Acids Res., 31(13):3497-500 (2003).

ClustalW calculates the best match between a reference and one or more candidate sequences, and aligns them so that identities, similarities and differences can be determined. Gaps of one or more residues can be inserted into a reference sequence, a candidate sequence, or both, to maximize sequence alignments. For fast pairwise alignment of nucleic acid sequences, the following default parameters are used: word size: 2; window size: 4; scoring method: percentage; number of top diagonals: 4; and gap penalty: 5. For multiple alignment of nucleic acid sequences, the following parameters are used: gap opening penalty: 10.0; gap extension penalty: 5.0; and weight transitions: yes. For fast pairwise alignment of protein sequences, the following parameters are used: word size: 1; window size: 5; scoring method: percentage; number of top diagonals: 5; gap penalty: 3. For multiple alignment of protein sequences, the following parameters are used: weight matrix: blosum; gap opening penalty: 10.0; gap extension penalty: 0.05; hydrophilic gaps: on; hydrophilic residues: Gly, Pro, Ser, Asn, Asp, Gln, Glu, Arg, and Lys; residue-specific gap penalties: on. The ClustalW output is a sequence alignment that reflects the relationship between sequences. ClustalW can be run, for example, at the Baylor College of Medicine Search Launcher site on the World Wide Web (searchlauncher.bcm.tmc.edu/multi-align/multi-align.html) and at the European Bioinformatics Institute site on the World Wide Web (ebi.ac.uk/clustalw).

To determine percent identity of a candidate nucleic acid or amino acid sequence to a reference sequence, the sequences are aligned using ClustalW, the number of identical matches in the alignment is divided by the length of the reference sequence, and the result is multiplied by 100. It is noted that the percent identity value can be rounded to the nearest tenth. For example, 78.11, 78.12, 78.13, and 78.14 are rounded down to 78.1, while 78.15, 78.16, 78.17, 78.18, and 78.19 are rounded up to 78.2.
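The calculation just described can be sketched in Python as follows. The sketch assumes the alignment has already been produced (e.g., by ClustalW) and is supplied as two equal-length strings with "-" marking gaps; the function names and the short aligned strings in the usage lines are illustrative only. Decimal arithmetic with half-up rounding is used so that 78.15 rounds to 78.2 as stated above, which binary floating-point rounding does not guarantee.

```python
from decimal import Decimal, ROUND_HALF_UP

TENTH = Decimal("0.1")


def round_to_tenth(value):
    """Round a decimal value (given as str, int, or Decimal) half-up to one place."""
    return Decimal(value).quantize(TENTH, rounding=ROUND_HALF_UP)


def percent_identity(aligned_ref, aligned_cand):
    """Identical matches divided by reference length, times 100, to the nearest tenth."""
    if len(aligned_ref) != len(aligned_cand):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as a match only if both residues are identical and not a gap.
    matches = sum(1 for r, c in zip(aligned_ref, aligned_cand)
                  if r == c and r != "-")
    # Length of the reference sequence itself, i.e., without alignment gaps.
    ref_length = sum(1 for r in aligned_ref if r != "-")
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    return round_to_tenth(Decimal(100 * matches) / Decimal(ref_length))


if __name__ == "__main__":
    # Toy alignment for illustration only.
    print(percent_identity("ACDE-FG", "ACDQAFG"))
    print(round_to_tenth("78.14"), round_to_tenth("78.15"))
```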

It will be appreciated that functional UGTs can include additional amino acids that are not involved in glucosylation or other enzymatic activities carried out by the enzyme, and thus such a polypeptide can be longer than would otherwise be the case. For example, a EUGT11 polypeptide can include a purification tag (e.g., HIS tag or GST tag), a chloroplast transit peptide, a mitochondrial transit peptide, an amyloplast peptide, signal peptide, or a secretion tag added to the amino or carboxy terminus. In some embodiments, a EUGT11 polypeptide includes an amino acid sequence that functions as a reporter, e.g., a green fluorescent protein or yellow fluorescent protein.

II. STEVIOL AND STEVIOL GLYCOSIDE BIOSYNTHESIS NUCLEIC ACIDS

A recombinant gene encoding a polypeptide described herein comprises the coding sequence for that polypeptide, operably linked in sense orientation to one or more regulatory regions suitable for expressing the polypeptide. Because many microorganisms are capable of expressing multiple gene products from a polycistronic mRNA, multiple polypeptides can be expressed under the control of a single regulatory region for those microorganisms, if desired. A coding sequence and a regulatory region are considered to be operably linked when the regulatory region and coding sequence are positioned so that the regulatory region is effective for regulating transcription or translation of the sequence. Typically, the translation initiation site of the translational reading frame of the coding sequence is positioned between one and about fifty nucleotides downstream of the regulatory region for a monocistronic gene.

In many cases, the coding sequence for a polypeptide described herein is identified in a species other than the recombinant host, i.e., is a heterologous nucleic acid. Thus, if the recombinant host is a microorganism, the coding sequence can be from other prokaryotic or eukaryotic microorganisms, from plants or from animals. In some cases, however, the coding sequence is a sequence that is native to the host and is being reintroduced into that organism. A native sequence can often be distinguished from the naturally occurring sequence by the presence of non-natural sequences linked to the exogenous nucleic acid, e.g., non-native regulatory sequences flanking a native sequence in a recombinant nucleic acid construct. In addition, stably transformed exogenous nucleic acids typically are integrated at positions other than the position where the native sequence is found.

“Regulatory region” refers to a nucleic acid having nucleotide sequences that influence transcription or translation initiation and rate, and stability and/or mobility of a transcription or translation product. Regulatory regions include, without limitation, promoter sequences, enhancer sequences, response elements, protein recognition sites, inducible elements, protein binding sequences, 5′ and 3′ untranslated regions (UTRs), transcriptional start sites, termination sequences, polyadenylation sequences, introns, and combinations thereof. A regulatory region typically comprises at least a core (basal) promoter. A regulatory region also may include at least one control element, such as an enhancer sequence, an upstream element or an upstream activation region (UAR). A regulatory region is operably linked to a coding sequence by positioning the regulatory region and the coding sequence so that the regulatory region is effective for regulating transcription or translation of the sequence. For example, to operably link a coding sequence and a promoter sequence, the translation initiation site of the translational reading frame of the coding sequence is typically positioned between one and about fifty nucleotides downstream of the promoter. A regulatory region can, however, be positioned as much as about 5,000 nucleotides upstream of the translation initiation site, or about 2,000 nucleotides upstream of the transcription start site.

The choice of regulatory regions to be included depends upon several factors, including, but not limited to, efficiency, selectability, inducibility, desired expression level, and preferential expression during certain culture stages. It is a routine matter for one of skill in the art to modulate the expression of a coding sequence by appropriately selecting and positioning regulatory regions relative to the coding sequence. It will be understood that more than one regulatory region may be present, e.g., introns, enhancers, upstream activation regions, transcription terminators, and inducible elements.

One or more genes can be combined in a recombinant nucleic acid construct in “modules” useful for a discrete aspect of steviol and/or steviol glycoside production. Combining a plurality of genes in a module, particularly a polycistronic module, facilitates the use of the module in a variety of species. For example, a steviol biosynthesis gene cluster, or a UGT gene cluster, can be combined in a polycistronic module such that, after insertion of a suitable regulatory region, the module can be introduced into a wide variety of species. As another example, a UGT gene cluster can be combined such that each UGT coding sequence is operably linked to a separate regulatory region, to form a UGT module. Such a module can be used in those species for which monocistronic expression is necessary or desirable. In addition to genes useful for steviol or steviol glycoside production, a recombinant construct typically also contains an origin of replication, and one or more selectable markers for maintenance of the construct in appropriate species.

It will be appreciated that because of the degeneracy of the genetic code, a number of nucleic acids can encode a particular polypeptide; i.e., for many amino acids, there is more than one nucleotide triplet that serves as the codon for the amino acid. Thus, codons in the coding sequence for a given polypeptide can be modified such that optimal expression in a particular host is obtained, using appropriate codon bias tables for that host (e.g., microorganism). SEQ ID NOs:18-25, 34-36, 40-43, 48-49, 52-55, 60-64, 70-72, and 154 set forth nucleotide sequences encoding certain enzymes for steviol and steviol glycoside biosynthesis, modified for increased expression in yeast. As isolated nucleic acids, these modified sequences can exist as purified molecules and can be incorporated into a vector or a virus for use in constructing modules for recombinant nucleic acid constructs.
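The effect of codon degeneracy on a coding sequence can be illustrated with a short Python sketch. The synonymous codons below are a small subset of the standard genetic code; the `PREFERRED` mapping is a hypothetical placeholder chosen for illustration and is not a measured codon bias table for yeast or any other host. The four-residue peptide in the usage lines is likewise a toy example.

```python
from math import prod

# Subset of the standard genetic code: synonymous DNA codons per amino acid.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
}

# Hypothetical host preferences (illustration only); a real table would be
# derived from codon usage data for the chosen host.
PREFERRED = {"M": "ATG", "W": "TGG", "K": "AAG", "F": "TTC", "A": "GCT", "G": "GGT"}

CODON_TO_AA = {codon: aa for aa, codons in SYNONYMOUS_CODONS.items()
               for codon in codons}


def back_translate(protein, preferred=PREFERRED):
    """Encode a protein using one preferred codon per amino acid."""
    return "".join(preferred[aa] for aa in protein)


def translate(dna):
    """Translate a DNA string codon by codon using the subset table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_encodings(protein):
    """Number of distinct coding sequences that encode the same protein."""
    return prod(len(SYNONYMOUS_CODONS[aa]) for aa in protein)


if __name__ == "__main__":
    peptide = "MKAG"  # toy peptide for illustration
    recoded = back_translate(peptide)
    print(recoded, translate(recoded) == peptide, count_encodings(peptide))
```

Two different nucleotide sequences, such as the recoded sequence and one built from other synonymous codons, translate to the same peptide, which is the property that permits host-specific codon modification without altering the encoded polypeptide.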

In some cases, it is desirable to inhibit one or more functions of an endogenous polypeptide in order to divert metabolic intermediates towards steviol or steviol glycoside biosynthesis. For example, it may be desirable to downregulate synthesis of sterols in a yeast strain in order to further increase steviol or steviol glycoside production, e.g., by downregulating squalene epoxidase. As another example, it may be desirable to inhibit degradative functions of certain endogenous gene products, e.g., glycohydrolases that remove glucose moieties from secondary metabolites or phosphatases as discussed herein. As another example, expression of membrane transporters involved in transport of steviol glycosides can be inhibited, such that secretion of glycosylated steviosides is inhibited. Such regulation can be beneficial in that secretion of steviol glycosides can be inhibited for a desired period of time during culture of the microorganism, thereby increasing the yield of glycoside product(s) at harvest. In such cases, a nucleic acid that inhibits expression of the polypeptide or gene product may be included in a recombinant construct that is transformed into the strain. Alternatively, mutagenesis can be used to generate mutants in genes for which it is desired to inhibit function.

III. HOSTS

A. Microorganisms

A number of prokaryotes and eukaryotes are suitable for use in constructing the recombinant microorganisms described herein, e.g., gram-negative bacteria, yeast and fungi. A species and strain selected for use as a steviol or steviol glycoside production strain is first analyzed to determine which production genes are endogenous to the strain and which genes are not present. Genes for which an endogenous counterpart is not present in the strain are assembled in one or more recombinant constructs, which are then transformed into the strain in order to supply the missing function(s).

Exemplary prokaryotic and eukaryotic species are described in more detail below. However, it will be appreciated that other species may be suitable. For example, suitable species may be in a genus selected from the group consisting of Agaricus, Aspergillus, Bacillus, Candida, Corynebacterium, Escherichia, Fusarium/Gibberella, Kluyveromyces, Laetiporus, Lentinus, Phaffia, Phanerochaete, Pichia, Physcomitrella, Rhodotorula, Saccharomyces, Schizosaccharomyces, Sphaceloma, Xanthophyllomyces and Yarrowia. Exemplary species from such genera include Lentinus tigrinus, Laetiporus sulphureus, Phanerochaete chrysosporium, Pichia pastoris, Physcomitrella patens, Rhodotorula glutinis 32, Rhodotorula mucilaginosa, Phaffia rhodozyma UBV-AX, Xanthophyllomyces dendrorhous, Fusarium fujikuroi/Gibberella fujikuroi, Candida utilis and Yarrowia lipolytica. In some embodiments, a microorganism can be an Ascomycete such as Gibberella fujikuroi, Kluyveromyces lactis, Schizosaccharomyces pombe, Aspergillus niger, or Saccharomyces cerevisiae. In some embodiments, a microorganism can be a prokaryote such as Escherichia coli, Rhodobacter sphaeroides, or Rhodobacter capsulatus. It will be appreciated that certain microorganisms can be used to screen and test genes of interest in a high throughput manner, while other microorganisms with desired productivity or growth characteristics can be used for large-scale production of steviol glycosides.

Saccharomyces cerevisiae

Saccharomyces cerevisiae is a widely used chassis organism in synthetic biology, and can be used as the recombinant microorganism platform. There are libraries of mutants, plasmids, detailed computer models of metabolism and other information available for S. cerevisiae, allowing for rational design of various modules to enhance product yield. Methods are known for making recombinant microorganisms.

A steviol biosynthesis gene cluster can be expressed in yeast using any of a number of known promoters. Strains that overproduce terpenes are known and can be used to increase the amount of geranylgeranyl diphosphate available for steviol and steviol glycoside production.

Aspergillus spp.

Aspergillus species such as A. oryzae, A. niger and A. sojae are widely used microorganisms in food production, and can also be used as the recombinant microorganism platform. Nucleotide sequences are available for genomes of A. nidulans, A. fumigatus, A. oryzae, A. clavatus, A. flavus, A. niger, and A. terreus, allowing rational design and modification of endogenous pathways to enhance flux and increase product yield. Metabolic models have been developed for Aspergillus, as well as transcriptomic studies and proteomics studies. A. niger is cultured for the industrial production of a number of food ingredients such as citric acid and gluconic acid, and thus species such as A. niger are generally suitable for the production of food ingredients such as steviol and steviol glycosides.

Escherichia coli

Escherichia coli, another widely used platform organism in synthetic biology, can also be used as the recombinant microorganism platform. Similar to Saccharomyces, there are libraries of mutants, plasmids, detailed computer models of metabolism and other information available for E. coli, allowing for rational design of various modules to enhance product yield. Methods similar to those described above for Saccharomyces can be used to make recombinant E. coli microorganisms.

Agaricus, Gibberella, and Phanerochaete spp.

Agaricus, Gibberella, and Phanerochaete spp. can be useful because they are known to produce large amounts of gibberellin in culture. Thus, the terpene precursors for producing large amounts of steviol and steviol glycosides are already produced by endogenous genes. Thus, modules containing recombinant genes for steviol or steviol glycoside biosynthesis polypeptides can be introduced into species from such genera without the necessity of introducing mevalonate or MEP pathway genes.

Arxula adeninivorans (Blastobotrys adeninivorans)

Arxula adeninivorans is a dimorphic yeast (it grows as a budding yeast like baker's yeast up to a temperature of 42° C.; above this threshold it grows in a filamentous form) with unusual biochemical characteristics. It can grow on a wide range of substrates and can assimilate nitrate. It has successfully been applied to the generation of strains that can produce natural plastics or the development of a biosensor for estrogens in environmental samples.

Yarrowia lipolytica

Yarrowia lipolytica is a dimorphic yeast (see Arxula adeninivorans) that can grow on a wide range of substrates. It has a high potential for industrial applications but there are no recombinant products commercially available yet.

Rhodobacter spp.

Rhodobacter can be used as the recombinant microorganism platform. Similar to E. coli, there are libraries of mutants available as well as suitable plasmid vectors, allowing for rational design of various modules to enhance product yield. Isoprenoid pathways have been engineered in membranous bacterial species of Rhodobacter for increased production of carotenoid and CoQ10. See, U.S. Patent Publication Nos. 20050003474 and 20040078846. Methods similar to those described above for E. coli can be used to make recombinant Rhodobacter microorganisms.

Candida boidinii

Candida boidinii is a methylotrophic yeast (it can grow on methanol). Like other methylotrophic species such as Hansenula polymorpha and Pichia pastoris, it provides an excellent platform for the production of heterologous proteins. Yields in a multigram range of a secreted foreign protein have been reported. A computational method, IPRO, recently predicted mutations that experimentally switched the cofactor specificity of Candida boidinii xylose reductase from NADPH to NADH.

Hansenula polymorpha (Pichia angusta)

Hansenula polymorpha is another methylotrophic yeast (see Candida boidinii). It can furthermore grow on a wide range of other substrates; it is thermo-tolerant and can assimilate nitrate (see also Kluyveromyces lactis). It has been applied to the production of hepatitis B vaccines, insulin and interferon alpha-2a for the treatment of hepatitis C, and, furthermore, to a range of technical enzymes.

Kluyveromyces lactis

Kluyveromyces lactis is a yeast regularly applied to the production of kefir. It can grow on several sugars, most importantly on lactose which is present in milk and whey. It has successfully been applied among others to the production of chymosin (an enzyme that is usually present in the stomach of calves) for the production of cheese. Production takes place in fermenters on a 40,000 L scale.

Pichia pastoris

Pichia pastoris is a methylotrophic yeast (see Candida boidinii and Hansenula polymorpha). It provides an efficient platform for the production of foreign proteins. Platform elements are available as a kit and it is used worldwide in academia for the production of proteins. Strains have been engineered that can produce complex human N-glycan (yeast glycans are similar but not identical to those found in humans).

Physcomitrella spp.

Physcomitrella mosses, when grown in suspension culture, have characteristics similar to yeast or other fungal cultures. This genus is becoming an important type of cell for production of plant secondary metabolites, which can be difficult to produce in other types of cells.

B. Plant Cells or Plants

In some embodiments, the nucleic acids and polypeptides described herein are introduced into plants or plant cells to increase overall steviol glycoside production or enrich for the production of specific steviol glycosides in proportion to others. Thus, a host can be a plant or a plant cell that includes at least one recombinant gene described herein. A plant or plant cell can be transformed by having a recombinant gene integrated into its genome, i.e., can be stably transformed. Stably transformed cells typically retain the introduced nucleic acid with each cell division. A plant or plant cell can also be transiently transformed such that the recombinant gene is not integrated into its genome. Transiently transformed cells typically lose all or some portion of the introduced nucleic acid with each cell division such that the introduced nucleic acid cannot be detected in daughter cells after a sufficient number of cell divisions. Both transiently transformed and stably transformed transgenic plants and plant cells can be useful in the methods described herein.

Transgenic plant cells used in methods described herein can constitute part or all of a whole plant. Such plants can be grown in a manner suitable for the species under consideration, either in a growth chamber, a greenhouse, or in a field. Transgenic plants can be bred as desired for a particular purpose, e.g., to introduce a recombinant nucleic acid into other lines, to transfer a recombinant nucleic acid to other species, or for further selection of other desirable traits. Alternatively, transgenic plants can be propagated vegetatively for those species amenable to such techniques. As used herein, a transgenic plant also refers to progeny of an initial transgenic plant provided the progeny inherits the transgene. Seeds produced by a transgenic plant can be grown and then selfed (or outcrossed and selfed) to obtain seeds homozygous for the nucleic acid construct.

Transgenic plants can be grown in suspension culture, or tissue or organ culture. For the purposes of this invention, solid and/or liquid tissue culture techniques can be used. When using solid medium, transgenic plant cells can be placed directly onto the medium or can be placed onto a filter that is then placed in contact with the medium. When using liquid medium, transgenic plant cells can be placed onto a flotation device, e.g., a porous membrane that contacts the liquid medium.

When transiently transformed plant cells are used, a reporter sequence encoding a reporter polypeptide having a reporter activity can be included in the transformation procedure and an assay for reporter activity or expression can be performed at a suitable time after transformation. A suitable time for conducting the assay typically is about 1-21 days after transformation, e.g., about 1-14 days, about 1-7 days, or about 1-3 days. The use of transient assays is particularly convenient for rapid analysis in different species, or to confirm expression of a heterologous polypeptide whose expression has not previously been confirmed in particular recipient cells.

Techniques for introducing nucleic acids into monocotyledonous and dicotyledonous plants are known in the art, and include, without limitation, Agrobacterium-mediated transformation, viral vector-mediated transformation, electroporation and particle gun transformation; see U.S. Pat. Nos. 5,538,880; 5,204,253; 6,329,571; and 6,013,863. If a cell or cultured tissue is used as the recipient tissue for transformation, plants can be regenerated from transformed cultures if desired, by techniques known to those skilled in the art.

A population of transgenic plants can be screened and/or selected for those members of the population that have a trait or phenotype conferred by expression of the transgene. For example, a population of progeny of a single transformation event can be screened for those plants having a desired level of expression of a steviol or steviol glycoside biosynthesis polypeptide or nucleic acid. Physical and biochemical methods can be used to identify expression levels. These include Southern analysis or PCR amplification for detection of a polynucleotide; Northern blots, S1 RNase protection, primer-extension, or RT-PCR amplification for detecting RNA transcripts; enzymatic assays for detecting enzyme or ribozyme activity of polypeptides and polynucleotides; and protein gel electrophoresis, Western blots, immunoprecipitation, and enzyme-linked immunoassays to detect polypeptides. Other techniques such as in situ hybridization, enzyme staining, and immunostaining also can be used to detect the presence or expression of polypeptides and/or nucleic acids. Methods for performing all of the referenced techniques are known. As an alternative, a population of plants comprising independent transformation events can be screened for those plants having a desired trait, such as production of a steviol glycoside or modulated biosynthesis of a steviol glycoside. Selection and/or screening can be carried out over one or more generations, and/or in more than one geographic location. In some cases, transgenic plants can be grown and selected under conditions which induce a desired phenotype or are otherwise necessary to produce a desired phenotype in a transgenic plant. In addition, selection and/or screening can be applied during a particular developmental stage in which the phenotype is expected to be exhibited by the plant. Selection and/or screening can be carried out to choose those transgenic plants having a statistically significant difference in a steviol glycoside level relative to a control plant that lacks the transgene.

The nucleic acids, recombinant genes, and constructs described herein can be used to transform a number of monocotyledonous and dicotyledonous plants and plant cell systems. Non-limiting examples of suitable monocots include, for example, cereal crops such as rice, rye, sorghum, millet, wheat, maize, and barley. The plant may be a non-cereal monocot such as asparagus, banana, or onion. The plant also may be a dicot such as stevia (Stevia rebaudiana), soybean, cotton, sunflower, pea, geranium, spinach, or tobacco. In some cases, the plant may contain the precursor pathways for prenyl phosphate production such as the mevalonate pathway, typically found in the cytoplasm and mitochondria. The non-mevalonate pathway is more often found in plant plastids [Dubey, et al., 2003 J. Biosci. 28 637-646]. One with skill in the art may target expression of steviol glycoside biosynthesis polypeptides to the appropriate organelle through the use of leader sequences, such that steviol glycoside biosynthesis occurs in the desired location of the plant cell. One with skill in the art will use appropriate promoters to direct synthesis, e.g., to the leaf of a plant, if so desired. Expression may also occur in tissue cultures such as callus culture or hairy root culture, if so desired.

In one embodiment, one or more nucleic acids or polypeptides described herein are introduced into Stevia (e.g., Stevia rebaudiana) such that overall steviol glycoside biosynthesis is increased or that the overall steviol glycoside composition is selectively enriched for one or more specific steviol glycosides (e.g., rebaudioside D). For example, one or more recombinant genes can be introduced into Stevia such that a EUGT11 enzyme (e.g., SEQ ID NO: 152 or a functional homolog thereof) is expressed alone or in combination with one or more of: a UGT91D enzyme such as UGT91D2e (e.g., SEQ ID NO: 5 or a functional homolog thereof) or UGT91D2m (e.g., SEQ ID NO: 10); a UGT85C enzyme such as a variant described in the “Functional Homolog” section; a UGT76G1 enzyme such as a variant described in the “Functional Homolog” section; or a UGT74G1 enzyme. Nucleic acid constructs typically include a suitable promoter (e.g., 35S, e35S, or ssRUBISCO promoters) operably linked to a nucleic acid encoding the UGT polypeptide. Nucleic acids can be introduced into Stevia by Agrobacterium-mediated transformation; electroporation-mediated gene transfer to protoplasts; or by particle bombardment. See, e.g., Singh, et al., Compendium of Transgenic Crop Plants: Transgenic Sugar, Tuber and Fiber, Edited by Chittaranjan Kole and Timothy C. Hall, Blackwell Publishing Ltd. (2008), pp. 97-115. For particle bombardment of stevia leaf derived callus, the parameters can be as follows: 6 cm distance, 1100 psi He pressure, gold particles, and one bombardment.

Stevia plants can be regenerated by somatic embryogenesis as described by Singh et al., 2008, supra. In particular, leaf segments (approximately 1-2 cm long) can be removed from 5 to 6-week-old in vitro raised plants and incubated (adaxial side down) on MS medium supplemented with B5 vitamins, 30 g sucrose and 3 g Gelrite. 2,4-dichlorophenoxyacetic acid (2,4-D) can be used in combination with 6-benzyl adenine (BA), kinetin (KN), or zeatin. Proembryogenic masses appear after 8 weeks of subculture. Within 2-3 weeks of subcultures, somatic embryos will appear on the surface of cultures. Embryos can be matured in medium containing BA in combination with 2,4-D, α-naphthaleneacetic acid (NAA), or indolebutyric acid (IBA). Mature somatic embryos that germinate and form plantlets can be excised from calli. After plantlets reach 3-4 weeks, the plantlets can be transferred to pots with vermiculite and grown for 6-8 weeks in growth chambers for acclimatization and transferred to greenhouses.

In one embodiment, steviol glycosides are produced in rice. Rice and maize are readily transformable using techniques such as Agrobacterium-mediated transformation. Binary vector systems are commonly utilized for Agrobacterium exogenous gene introduction to monocots. See, for example, U.S. Pat. Nos. 6,215,051 and 6,329,571. In a binary vector system, one vector contains the T-DNA region, which includes a gene of interest (e.g., a UGT described herein) and the other vector is a disarmed Ti plasmid containing the vir region. Co-integrated vectors and mobilizable vectors also can be used. The types and pretreatment of tissues to be transformed, the strain of Agrobacterium used, the duration of the inoculation, and the prevention of overgrowth and necrosis by the Agrobacterium can be readily adjusted by one of skill in the art. Immature embryo cells of rice can be prepared for transformation with Agrobacterium using binary vectors. The culture medium used is supplemented with phenolic compounds. Alternatively, the transformation can be done in planta using vacuum infiltration. See, for example, WO 2000037663, WO 2000063400, and WO 2001012828.

IV. METHODS OF PRODUCING STEVIOL GLYCOSIDES

Recombinant hosts described herein can be used in methods to produce steviol or steviol glycosides. For example, if the recombinant host is a microorganism, the method can include growing the recombinant microorganism in a culture medium under conditions in which steviol and/or steviol glycoside biosynthesis genes are expressed. The recombinant microorganism may be grown in a fed batch or continuous process. Typically, the recombinant microorganism is grown in a fermentor at a defined temperature(s) for a desired period of time. Depending on the particular microorganism used in the method, other recombinant genes such as isopentenyl biosynthesis genes and terpene synthase and cyclase genes may also be present and expressed. Levels of substrates and intermediates, e.g., isopentenyl diphosphate, dimethylallyl diphosphate, geranylgeranyl diphosphate, kaurene and kaurenoic acid, can be determined by extracting samples from culture media for analysis according to published methods.

After the recombinant microorganism has been grown in culture for the desired period of time, steviol and/or one or more steviol glycosides can then be recovered from the culture using various techniques known in the art. In some embodiments, a permeabilizing agent can be added to aid the feedstock entering into the host and product getting out. If the recombinant host is a plant or plant cells, steviol or steviol glycosides can be extracted from the plant tissue using various techniques known in the art. For example, a crude lysate of the cultured microorganism or plant tissue can be centrifuged to obtain a supernatant. The resulting supernatant can then be applied to a chromatography column, e.g., a C-18 column, and washed with water to remove hydrophilic compounds, followed by elution of the compound(s) of interest with a solvent such as methanol. The compound(s) can then be further purified by preparative HPLC. See also WO 2009/140394.

The amount of steviol glycoside (e.g., rebaudioside D) produced can be from about 1 mg/L to about 1500 mg/L, e.g., about 1 to about 10 mg/L, about 3 to about 10 mg/L, about 5 to about 20 mg/L, about 10 to about 50 mg/L, about 10 to about 100 mg/L, about 25 to about 500 mg/L, about 100 to about 1,500 mg/L, or about 200 to about 1,000 mg/L. In general, longer culture times will lead to greater amounts of product. Thus, the recombinant microorganism can be cultured for from 1 day to 7 days, from 1 day to 5 days, from 3 days to 5 days, about 3 days, about 4 days, or about 5 days.

It will be appreciated that the various genes and modules discussed herein can be present in two or more recombinant microorganisms rather than a single microorganism. When a plurality of recombinant microorganisms is used, they can be grown in a mixed culture to produce steviol and/or steviol glycosides. For example, a first microorganism can comprise one or more biosynthesis genes for producing steviol while a second microorganism comprises steviol glycoside biosynthesis genes. Alternatively, the two or more microorganisms each can be grown in a separate culture medium and the product of the first culture medium, e.g., steviol, can be introduced into a second culture medium to be converted into a subsequent intermediate, or into an end product such as rebaudioside A. The product produced by the second, or final, microorganism is then recovered. It will also be appreciated that in some embodiments, a recombinant microorganism is grown using nutrient sources other than a culture medium and utilizing a system other than a fermentor.

Steviol glycosides do not necessarily have equivalent performance in different food systems. It is therefore desirable to have the ability to direct the synthesis to steviol glycoside compositions of choice. Recombinant hosts described herein can produce compositions that are selectively enriched for specific steviol glycosides (e.g., rebaudioside D) and have a consistent taste profile. Thus, the recombinant microorganisms, plants, and plant cells described herein can facilitate the production of compositions that are tailored to meet the sweetening profile desired for a given food product and that have a proportion of each steviol glycoside that is consistent from batch to batch. Microorganisms described herein do not produce the undesired plant byproducts found in Stevia extracts. Thus, steviol glycoside compositions produced by the recombinant microorganisms described herein are distinguishable from compositions derived from Stevia plants.

V. FOOD PRODUCTS

The steviol glycosides obtained by the methods disclosed herein can be used to make food products, dietary supplements and sweetener compositions. For example, substantially pure steviol or steviol glycoside such as rebaudioside A or rebaudioside D can be included in food products such as ice cream, carbonated beverages, fruit juices, yogurts, baked goods, chewing gums, hard and soft candies, and sauces. Substantially pure steviol or steviol glycoside can also be included in non-food products such as pharmaceutical products, medicinal products, dietary supplements and nutritional supplements. Substantially pure steviol or steviol glycosides may also be included in animal feed products for both the agriculture industry and the companion animal industry. Alternatively, a mixture of steviol and/or steviol glycosides can be made by culturing recombinant microorganisms separately or growing different plants/plant cells, each producing a specific steviol or steviol glycoside, recovering the steviol or steviol glycoside in substantially pure form from each microorganism or plant/plant cells and then combining the compounds to obtain a mixture containing each compound in the desired proportion. The recombinant microorganisms, plants, and plant cells described herein permit more precise and consistent mixtures to be obtained compared to current Stevia products. In another alternative, a substantially pure steviol or steviol glycoside can be incorporated into a food product along with other sweeteners, e.g. saccharin, dextrose, sucrose, fructose, erythritol, aspartame, sucralose, monatin, or acesulfame potassium. The weight ratio of steviol or steviol glycoside relative to other sweeteners can be varied as desired to achieve a satisfactory taste in the final food product. See, e.g., U.S. Patent Publication No. 2007/0128311. In some embodiments, the steviol or steviol glycoside may be provided with a flavor (e.g., citrus) as a flavor modulator. For example, Rebaudioside C can be used as a sweetness enhancer or sweetness modulator, in particular for carbohydrate based sweeteners, such that the amount of sugar can be reduced in the food product.

Compositions produced by a recombinant microorganism, plant, or plant cell described herein can be incorporated into food products. For example, a steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a food product in an amount ranging from about 20 mg steviol glycoside/kg food product to about 1800 mg steviol glycoside/kg food product on a dry weight basis, depending on the type of steviol glycoside and food product. For example, a steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a dessert, cold confectionary (e.g., ice cream), dairy product (e.g., yogurt), or beverage (e.g., a carbonated beverage) such that the food product has a maximum of 500 mg steviol glycoside/kg food on a dry weight basis. A steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a baked good (e.g., a biscuit) such that the food product has a maximum of 300 mg steviol glycoside/kg food on a dry weight basis. A steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a sauce (e.g., chocolate syrup) or vegetable product (e.g., pickles) such that the food product has a maximum of 1000 mg steviol glycoside/kg food on a dry weight basis. A steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a bread such that the food product has a maximum of 160 mg steviol glycoside/kg food on a dry weight basis. A steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a hard or soft candy such that the food product has a maximum of 1600 mg steviol glycoside/kg food on a dry weight basis. A steviol glycoside composition produced by a recombinant microorganism, plant, or plant cell can be incorporated into a processed fruit product (e.g., fruit juices, fruit filling, jams, and jellies) such that the food product has a maximum of 1000 mg steviol glycoside/kg food on a dry weight basis.
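The per-category maxima listed above lend themselves to a simple lookup. The following Python sketch is illustrative only: the dictionary keys and the function names `level_mg_per_kg` and `within_maximum` are hypothetical labels chosen here, and only the numerical maxima (mg steviol glycoside/kg food, dry weight basis) are taken from the preceding paragraph.

```python
# Maximum steviol glycoside levels (mg/kg food, dry weight basis) as stated
# in the text. The category names used as keys are hypothetical labels.
MAX_MG_PER_KG = {
    "dessert": 500,
    "cold confectionery": 500,
    "dairy product": 500,
    "beverage": 500,
    "baked good": 300,
    "sauce": 1000,
    "vegetable product": 1000,
    "bread": 160,
    "hard or soft candy": 1600,
    "processed fruit product": 1000,
}


def level_mg_per_kg(glycoside_mg, food_dry_kg):
    """Return the steviol glycoside level in mg per kg of food (dry weight)."""
    if food_dry_kg <= 0:
        raise ValueError("food_dry_kg must be positive")
    return glycoside_mg / food_dry_kg


def within_maximum(category, glycoside_mg, food_dry_kg):
    """True if the level does not exceed the stated maximum for the category."""
    return level_mg_per_kg(glycoside_mg, food_dry_kg) <= MAX_MG_PER_KG[category]
```

For instance, 80 mg of steviol glycoside in 0.5 kg of bread (dry weight) corresponds to 160 mg/kg, which sits exactly at the stated maximum for bread.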

For example, such a steviol glycoside composition can have from 90-99% rebaudioside A and an undetectable amount of stevia plant-derived contaminants, and be incorporated into a food product at from 25-1600 mg/kg, e.g., 100-500 mg/kg, 25-100 mg/kg, 250-1000 mg/kg, 50-500 mg/kg or 500-1000 mg/kg on a dry weight basis.

Such a steviol glycoside composition can be a rebaudioside B-enriched composition having greater than 3% rebaudioside B and be incorporated into the food product such that the amount of rebaudioside B in the product is from 25-1600 mg/kg, e.g., 100-500 mg/kg, 25-100 mg/kg, 250-1000 mg/kg, 50-500 mg/kg or 500-1000 mg/kg on a dry weight basis. Typically, the rebaudioside B-enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a rebaudioside C-enriched composition having greater than 15% rebaudioside C and be incorporated into the food product such that the amount of rebaudioside C in the product is from 20-600 mg/kg, e.g., 100-600 mg/kg, 20-100 mg/kg, 20-95 mg/kg, 20-250 mg/kg, 50-75 mg/kg or 50-95 mg/kg on a dry weight basis. Typically, the rebaudioside C-enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a rebaudioside D-enriched composition having greater than 3% rebaudioside D and be incorporated into the food product such that the amount of rebaudioside D in the product is from 25-1600 mg/kg, e.g., 100-500 mg/kg, 25-100 mg/kg, 250-1000 mg/kg, 50-500 mg/kg or 500-1000 mg/kg on a dry weight basis. Typically, the rebaudioside D-enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a rebaudioside E-enriched composition having greater than 3% rebaudioside E and be incorporated into the food product such that the amount of rebaudioside E in the product is from 25-1600 mg/kg, e.g., 100-500 mg/kg, 25-100 mg/kg, 250-1000 mg/kg, 50-500 mg/kg or 500-1000 mg/kg on a dry weight basis. Typically, the rebaudioside E-enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a rebaudioside F-enriched composition having greater than 4% rebaudioside F and be incorporated into the food product such that the amount of rebaudioside F in the product is from 25-1000 mg/kg, e.g., 100-600 mg/kg, 25-100 mg/kg, 25-95 mg/kg, 50-75 mg/kg or 50-95 mg/kg on a dry weight basis. Typically, the rebaudioside F-enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a dulcoside A-enriched composition having greater than 4% dulcoside A and be incorporated into the food product such that the amount of dulcoside A in the product is from 25-1000 mg/kg, e.g., 100-600 mg/kg, 25-100 mg/kg, 25-95 mg/kg, 50-75 mg/kg or 50-95 mg/kg on a dry weight basis. Typically, the dulcoside A-enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a composition enriched for rubusoside xylosylated on either of the two positions, the 13-O-glucose or the 19-O-glucose. Such a composition can have greater than 4% of the xylosylated rubusoside compound, and can be incorporated into the food product such that the amount of xylosylated rubusoside compound in the product is from 25-1000 mg/kg, e.g., 100-600 mg/kg, 25-100 mg/kg, 25-95 mg/kg, 50-75 mg/kg or 50-95 mg/kg on a dry weight basis. Typically, the xylosylated rubusoside enriched composition has an undetectable amount of stevia plant-derived contaminants.

Such a steviol glycoside composition can be a composition enriched for compounds rhamnosylated on either of the two positions, the 13-O-glucose or the 19-O-glucose, or compounds containing one rhamnose and multiple glucoses (e.g., steviol 13-O-1,3-diglycoside-1,2-rhamnoside). Such a composition can have greater than 4% of the rhamnosylated compound, and can be incorporated into the food product such that the amount of rhamnosylated compound in the product is from 25-1000 mg/kg, e.g., 100-600 mg/kg, 25-100 mg/kg, 25-95 mg/kg, 50-75 mg/kg or 50-95 mg/kg on a dry weight basis. Typically, the composition enriched for rhamnosylated compounds has an undetectable amount of stevia plant-derived contaminants.

In some embodiments, a substantially pure steviol or steviol glycoside is incorporated into a tabletop sweetener or “cup-for-cup” product. Such products typically are diluted to the appropriate sweetness level with one or more bulking agents, e.g., maltodextrins, known to those skilled in the art. Steviol glycoside compositions enriched for rebaudioside A, rebaudioside C, rebaudioside D, rebaudioside E, rebaudioside F, dulcoside A, or rhamnosylated or xylosylated compounds, can be packaged in a sachet, for example, at from 10,000 to 30,000 mg steviol glycoside/kg product on a dry weight basis, for tabletop use.

VI. EXAMPLES

The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims. In the examples described herein, the following LC-MS methodology was used to analyze steviol glycosides and steviol pathway intermediates unless otherwise indicated.

1) Analyses of steviol glycosides

LC-MS analyses were performed using an Agilent 1200 Series HPLC system (Agilent Technologies, Wilmington, Del., USA) fitted with a Phenomenex® Kinetex C18 column (150×2.1 mm, 2.6 μm particles, 100 Å pore size) connected to a TSQ Quantum Access (ThermoFisher Scientific) triple quadrupole mass spectrometer with a heated electrospray ionization (HESI) source. Elution was carried out using a mobile phase of eluent B (MeCN with 0.1% formic acid) and eluent A (water with 0.1% formic acid) by increasing the gradient from 10->40% B from min 0.0 to 1.0, increasing 40->50% B in min 1.0 to 6.5, 50->100% B from min 6.5 to 7.0 and finally washing and re-equilibration. The flow rate was 0.4 mL/min and the column temperature 30° C. The steviol glycosides were detected using SIM (Single Ion Monitoring) in positive mode with the following m/z-traces.

Description           Exact Mass             m/z trace       Compound (typical t_(R) in min)
Steviol + 1 Glucose   [M + H]⁺ 481.2796      481.2 ± 0.5     19-SMG (6.1), 13-SMG (6.4)
                      [M + Na]⁺ 503.2615     503.1 ± 0.5
Steviol + 2 Glucose   [M + Na]⁺ 665.3149     665 ± 0.5       Rubusoside (4.7), Steviol-1,2-bioside (5.2), Steviol-1,3-bioside (5.8)
Steviol + 3 Glucose   [M + Na]⁺ 827.3677     827.4 ± 0.5     1,2-Stevioside (4.0), 1,3-Stevioside (4.4), Rebaudioside B (5.0)
Steviol + 4 Glucose   [M + Na]⁺ 989.4200     989.4 ± 0.5     Rebaudioside A (3.9)
Steviol + 5 Glucose   [M + Na]⁺ 1151.4728    1151.4 ± 0.5    Rebaudioside D (3.3)
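As a cross-check on the exact masses tabulated above, the adduct m/z values can be recomputed from elemental composition. The Python sketch below is a minimal illustration and not part of the described methodology. It assumes the molecular formula of steviol (C20H30O3), treats each added glucose as a C6H10O5 residue (glucose minus water), and uses standard monoisotopic masses; the function names are hypothetical. Under these assumptions the computed values agree with the tabulated exact masses to within about 0.001.

```python
# Monoisotopic masses (u); standard values, not taken from the source text.
MASS = {"C": 12.000000, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858  # mass of the electron lost when forming a cation


def formula_mass(counts):
    """Sum monoisotopic masses for a dict mapping element symbol to count."""
    return sum(MASS[element] * n for element, n in counts.items())


STEVIOL = formula_mass({"C": 20, "H": 30, "O": 3})   # assumed formula C20H30O3
GLUCOSYL = formula_mass({"C": 6, "H": 10, "O": 5})   # glucose minus H2O


def adduct_mz(n_glucose, adduct):
    """m/z of the singly charged [M + H]+ or [M + Na]+ ion (adduct 'H' or 'Na')."""
    neutral = STEVIOL + n_glucose * GLUCOSYL
    return neutral + MASS[adduct] - ELECTRON


def in_sim_window(mz, center, half_width=0.5):
    """True if mz falls inside a SIM trace given as center ± half_width."""
    return abs(mz - center) <= half_width
```

For example, `adduct_mz(4, "Na")` gives the sodium adduct of steviol plus four glucoses (rebaudioside A), and `in_sim_window` can confirm that it falls inside the 989.4 ± 0.5 trace.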

The levels of steviol glycosides were quantified by comparing with calibration curves obtained with authentic standards from LGC Standards. For example, standard solutions of 0.5 to 100 μM Rebaudioside A were typically utilized to construct a calibration curve.
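Quantification against a calibration curve can be sketched as an ordinary least-squares line fitted to standard concentrations and detector responses. The Python example below is illustrative only: the text does not state which regression model was used, so a linear fit is an assumption, and the function names and the response values in the usage note are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_response(response, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (response - intercept) / slope


def within_calibrated_range(concentration, standards):
    """True if the estimate lies between the lowest and highest standard."""
    return min(standards) <= concentration <= max(standards)
```

As a usage illustration with made-up numbers, standards at 0.5, 1, 5, 10, 50 and 100 μM with responses following response = 2 × concentration + 1 yield a slope of 2 and an intercept of 1, so a sample response of 41 corresponds to 20 μM, which lies within the calibrated range.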

2) Analyses of Steviol and ent-kaurenoic acid

LC-MS analyses of steviol and ent-kaurenoic acid were performed on the system described above. For the separation, a Thermo Scientific Hypersil Gold (C-18, 3 μm, 100×2.1 mm) column was used and a 20 mM ammonium acetate aqueous solution was used as eluent A and acetonitrile as eluent B. The gradient conditions were: 20->55% B in min 0.0 to 1.0, 55->100% B in min 1.0-7.0 and finally washing and re-equilibration. The flow rate was 0.5 mL/min and the column temperature 30° C. Steviol and ent-kaurenoic acid were detected using SIM (Single Ion Monitoring) in negative mode with the following m/z-traces.

Description          Exact Mass            m/z trace      Typical t_(R) in min
Steviol              [M − H]⁻ 317.2122     317.4 ± 0.5    3.3
Ent-kaurenoic acid   [M − H]⁻ 301.2173     301.4 ± 0.5    5.5

3) HPLC quantification of UDP-Glucose

For the quantification of UDP-glucose, an Agilent 1200 Series HPLC system was used, with a Waters XBridge BEH amide (2.5 μm, 3.0×50 mm) column. Eluent A was a 10 mM ammonium acetate aqueous solution (pH 9.0) and Eluent B acetonitrile. The gradient conditions were: 95% B holding from min 0.0-0.5, decreasing from 95-50% B in min 0.5-4.5, holding 50% B from min 4.5-6.8 and finally re-equilibrating to 95% B. The flow rate was 0.9 mL/min and the column temperature 20° C. UDP-glucose was detected by UV_(262 nm) absorbance.
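The gradient just described is a piecewise-linear program, and the eluent composition at any time can be obtained by interpolating between its breakpoints. The Python sketch below is illustrative; the breakpoint list is transcribed from the paragraph above, while the function name `percent_b` is hypothetical. Because the duration of the final re-equilibration to 95% B is not stated, times after 6.8 min are deliberately rejected rather than guessed.

```python
# (time in min, % eluent B) breakpoints for the UDP-glucose method.
GRADIENT = [(0.0, 95.0), (0.5, 95.0), (4.5, 50.0), (6.8, 50.0)]


def percent_b(t, program=GRADIENT):
    """Linearly interpolate % B at time t (min) within the stated program."""
    if t < program[0][0] or t > program[-1][0]:
        # Re-equilibration timing is not given in the text, so it is not modelled.
        raise ValueError("time outside the specified gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    return program[-1][1]


def percent_a(t, program=GRADIENT):
    """% eluent A, assuming a binary A/B mobile phase summing to 100%."""
    return 100.0 - percent_b(t, program)
```

Halfway through the 0.5-4.5 min ramp (t = 2.5 min), the interpolation gives 72.5% B and therefore 27.5% A.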

The amount of UDP-glucose was quantified by comparing with a calibration curve obtained with a commercially available standard (e.g., from Sigma Aldrich).

Example 1—Identification of EUGT11

Fifteen genes were tested for RebA 1,2-glycosylation activity. See Table 10.

TABLE 10
Name      Source                                 GenBank Accession No.
EUGT2     Oryza sativa UGT91 homolog             AP003270
EUGT3     Oryza sativa UGT91 homolog             AP005171
EUGT4     Oryza sativa UGT91 homolog             AP005643
EUGT6     Oryza sativa UGT91 homolog             AP005259
EUGT7     Oryza sativa UGT91 homolog             AP005171
EUGT8     Oryza sativa UGT91 homolog             XM_470006
EUGT9     Oryza sativa UGT91 homolog             AP005643
EUGT10    Oryza sativa UGT91 homolog             AC133334
EUGT11    Oryza sativa UGT91 homolog             AC133334
EUGT12    Oryza sativa UGT91 homolog             AC133334
EUGT15    Petunia × hybrida UGT79 homolog        Z25802
EUGT16    Arabidopsis thaliana UGT79 homolog     AC004786
EUGT17    Dianthus caryophyllus UGT79 homolog    AB294391
EUGT18    Ipomoea nil UGT79 homolog              AB192314
EUGT19    Oryza sativa UGT79 homolog             NM_001074394

In vitro transcription and translation of these genes was performed, and the resulting UGTs were incubated with RebA and UDP-glucose. Following incubation, the reactions were analyzed by LC-MS. The reaction mixture containing EUGT11 (Rice, AC133334, SEQ ID NO: 152) was shown to convert significant quantities of RebA to RebD. See LC-MS chromatograms in FIG. 4. As shown in the left panel of FIG. 4, UGT91D2e produced a trace amount of RebD when RebA was used as the feedstock. As shown in the right panel of FIG. 4, EUGT11 produced a significant amount of RebD when RebA was used as the feedstock. Preliminary quantification of the amount of RebD that was produced indicated that EUGT11 was approximately 30 times more efficient than UGT91D2e at converting RebA to RebD.

To further characterize EUGT11 and for quantitative comparison to UGT91D2e, the nucleotide sequence encoding EUGT11 (SEQ ID NO: 153, non-codon optimized, FIG. 7) was cloned into two E. coli expression vectors, one containing an N-terminal HIS-tag and one containing an N-terminal GST-tag. EUGT11 was expressed using both systems and purified. When the purified enzymes were incubated with UDP-glucose and RebA, RebD was produced.

Example 2—Identification of EUGT11 Reactions

EUGT11 was produced by in vitro transcription and translation, and incubated with various substrates in the RebD pathway. Similar experiments were carried out using in vitro transcribed and translated UGT91D2e. FIG. 3 shows a schematic overview of 19-O-1,2-diglycosylation reactions performed by EUGT11 and UGT91D2e. Compounds 1-3 were identified solely by mass and expected retention time. The numbers shown in FIG. 3 are the average peak height of the indicated steviol glycoside obtained from an LC-MS chromatogram, and, although not quantitative, can be used to compare the activity of the two enzymes. EUGT11 and UGT91D2e were not able to use steviol as a substrate. Both enzymes were able to convert steviol 19-O-monoglucoside (19-SMG) to compound 1, with EUGT11 being about ten times more efficient than UGT91D2e at converting 19-SMG to compound 1.

Both enzymes were able to convert rubusoside to stevioside with comparable activity, but only EUGT11 was able to convert rubusoside to compound 2 and compound 3 (RebE). See FIG. 5. The left panel of FIG. 5 contains LC-MS chromatograms of the conversion of rubusoside to stevioside. The right panel of FIG. 5 contains chromatograms of the conversion of rubusoside to stevioside, to compound 2, and to compound 3 (RebE). Conversion of rubusoside to compound 3 requires two consecutive 1,2-O-glycosylations at the 19- and 13-positions of steviol. UGT91D2e was able to produce a trace amount of compound 3 (RebE) in one experiment whereas EUGT11 produced a significant amount of compound 3.

Both enzymes were able to convert RebA to RebD. However, EUGT11 was approximately 30 times better at converting RebA to RebD. Overall, it appears that EUGT11 produces more product than UGT91D2e in all reactions (with similar time, concentrations, temperature, and purity of enzyme) except the conversion of rubusoside to stevioside.

Example 3—Expression of EUGT11 in Yeast

The nucleotide sequence encoding EUGT11 was codon-optimized (SEQ ID NO: 154) and transformed into yeast along with nucleic acids encoding all four UGTs (UGT91D2e, UGT74G1, UGT76G1, and UGT85C2). The resulting yeast strain was grown in medium containing steviol, and the steviol glycosides that accumulated were analyzed by LC-MS. In this experiment, EUGT11 was required for the production of RebD. In other experiments, RebD production has been observed with UGT91D2e, UGT74G1, UGT76G1, and UGT85C2.

Example 4—UGT Activity on 19-O-1,2-diglycosylated Steviol Glycosides

The 19-O-1,2-diglycosylated steviol glycosides produced by EUGT11 need further glycosylation to be converted to RebD. The following experiments were performed to determine if other UGTs could use these intermediates as substrates.

In one experiment, compound 1 was produced in vitro from 19-SMG by either EUGT11 or UGT91D2e in the presence of UDP-glucose. After boiling the sample, UGT85C2 and UDP-glucose were added. The sample was analyzed by LC-MS and compound 2 was detected. This experiment indicated that UGT85C2 can use compound 1 as a substrate.

In another experiment, compound 2 was incubated with UGT91D2e and UDP-glucose. The reaction was analyzed by LC-MS. UGT91D2e was not able to convert compound 2 to compound 3 (RebE). Incubation of compound 2 with EUGT11 and UDP-glucose resulted in the production of compound 3. UGT76G1 was able to use RebE as a substrate to produce RebD.

This shows that the 19-O-1,2-diglycosylation of the steviol glycosides is able to take place at any time during production of RebD, as the downstream enzymes are able to metabolize the 19-O-1,2-diglycosylated intermediates.

Example 5—Comparison of EUGT11 and UGT91D2e Sequence

The amino acid sequence of EUGT11 (SEQ ID NO:152, FIG. 7) and the amino acid sequence of UGT91D2e (SEQ ID NO:5) were aligned using the FASTA algorithm (Pearson and Lipman, Proc. Natl. Acad. Sci., 85:2444-2448 (1988)). See FIG. 6. EUGT11 and UGT91D2e are 42.7% identical over 457 amino acids.
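As an illustration of how a percent-identity figure of this kind can be derived from a pairwise alignment, the following minimal sketch counts matching residues over the aligned columns in which neither sequence has a gap. The function name `percent_identity`, the gap symbol, and the toy sequences are hypothetical and are not taken from SEQ ID NO:152 or SEQ ID NO:5. The FASTA program may use a different denominator convention, so this sketch is not expected to reproduce the reported value exactly.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, hypothetical and for illustration only.
toy_a = "MDSG-YSSSY"
toy_b = "MATSDSIVDD"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical over ungapped columns")
```

The same function can be applied to the two rows of any gapped pairwise alignment exported as equal-length strings.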

Example 6—Modification of 19-1,2-Diglycosylating Activity of UGT91D2e

Crystal structures are available for a number of UGTs. Generally, the N-terminal half of a UGT is primarily involved with substrate binding whereas the C-terminal half is involved in binding the UDP-sugar donor.

Modeling the secondary structure of UGT91D2e onto the secondary structure of the UGTs that have been crystallized revealed a conserved pattern of secondary structure, despite a highly diverged primary sequence, as shown in FIG. 8. The crystal structures of UGT71G1 and UGT85H2 (see, for example, H. Shao et al., The Plant Cell November 2005 vol. 17 no. 11 3141-3154 and L. Li et al., J Mol Biol. 2007 370(5):951-63) have been reported. Known loops, alpha-helices and beta-sheets are indicated on UGT91D2e in FIG. 8. Although the homology at the primary structure level of these UGTs is fairly low, the secondary structure appears to be conserved, allowing predictions regarding the locations of amino acids involved in substrate binding on UGT91D2e based on the location of such amino acids in UGT85H2 and UGT71G1.

Regions commonly involved in substrate binding were superimposed on UGT91D2e and largely shown to coincide with the 22 amino acid differences from UGT91D1 (GenBank Protein Accession No. AAR06918, GI:37993665). UGT91D1 is highly expressed in Stevia and thought to be a functional UGT. However, its substrate is not a steviol glycoside. This suggests that UGT91D1 has a different substrate, which may be defined by the 22 amino acids with which it differs from UGT91D2e. FIG. 9 is an alignment of the amino acid sequences of UGT91D1 and UGT91D2e. The boxes represent areas that are reported to be involved in substrate binding. The amino acids highlighted in dark grey show the 22 amino acid differences between UGT91D1 and UGT91D2e. Stars denote amino acids that have been shown to be involved in substrate binding in UGTs that have had their crystal structure resolved (more stars under one particular amino acid means substrate binding has been shown with more than one structure-resolved UGT). There is a strong correlation between the 22 amino acid differences between the two UGT91s, the regions known to be involved in substrate binding, and the actual amino acids involved in substrate binding in the crystal structure-resolved UGTs. This suggests that the 22 amino acid differences between the two UGT91s are involved in substrate binding.

All 22 altered 91D2es were expressed in a XJb Autolysis E. coli strain from a pGEX-4T1 vector. In order to assess the activity of the enzymes, two substrate feeding experiments were performed, in vivo and in vitro. Most mutants had lower activity than wild type; however, 5 mutants showed increased activity. This was reproduced by in vitro transcription and translation (IVT) and showed that C583A, C631A and T857C have approximately 3-fold higher stevioside-forming activity than the wild-type UGT91D2e, whereas C662T and A1313C had approximately twice the stevioside-forming activity (nucleotide numbering). These changes result in amino acid mutations corresponding to L195M, L211M, and V286A; and S221F and E438A, respectively. The increased activity differed depending on substrate, with C583A and C631A showing almost a 10-fold increase using 13-SMG as substrate and about a 3-fold increase using rubusoside as substrate, whereas T857C showed a 3-fold increase when using either 13-SMG or rubusoside as substrate.
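The correspondence between the nucleotide numbering and the amino acid numbering used above can be checked with a short sketch. It assumes, as the pairing in the text implies, that nucleotide 1 is the first base of the start codon of the UGT91D2e coding sequence. The helper names `codon_index` and `position_in_codon` are hypothetical.

```python
def codon_index(nt_position: int) -> int:
    """1-based codon (amino acid) number containing a 1-based coding-sequence nucleotide."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


def position_in_codon(nt_position: int) -> int:
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) % 3 + 1


# Nucleotide change -> amino acid change, as paired in the text.
substitutions = {
    "C583A": "L195M",
    "C631A": "L211M",
    "T857C": "V286A",
    "C662T": "S221F",
    "A1313C": "E438A",
}

for nt_change, aa_change in substitutions.items():
    nt_pos = int(nt_change[1:-1])   # strip the reference and mutant bases
    aa_pos = int(aa_change[1:-1])   # strip the reference and mutant residues
    consistent = codon_index(nt_pos) == aa_pos
    print(nt_change, "-> codon", codon_index(nt_pos),
          "base", position_in_codon(nt_pos),
          "matches" if consistent else "does not match", aa_change)
```

Under this numbering assumption, each nucleotide position falls in the codon whose number appears in the corresponding amino acid change.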

To investigate if these mutations were additive, a range of double mutants were made and analyzed for activity (FIG. 10). In this particular experiment, a higher wild type level of activity was observed than in the previous four experiments; however, the relative activities of the mutations remain the same. As rubusoside accumulates in many of the S. cerevisiae strains expressing the 4 UGTs (UGT74G1, UGT85C2, UGT76G1, and UGT91D2e), the stevioside-forming activity may be more important for increasing steviol glycoside production. As such, the double mutant C631A/T857C (nucleotide numbering) may be useful. This mutant has been named UGT91D2e-b, which contains the amino acid modifications L211M and V286A. The experiments have been reproduced in vitro using S. cerevisiae-expressed UGT91D2e-mutants.

To improve 19-1,2-diglycosylating activity of UGT91D2e, a directed saturation mutagenesis screen of UGT91D2e was performed at the 22 amino acid positions that differ between UGT91D2e and UGT91D1. GeneArt® (Life Technologies, Carlsbad, Calif.) site-saturation mutagenesis was used to obtain a library containing each of the mutations. The library was cloned into the BamHI and NotI sites of the pGEX4T1 bacterial expression plasmid, expressing the mutated versions of 91D2e as GST fusion proteins and resulting in a new library (Lib #116). Lib #116 was transformed into XJb Autolysis E. coli strain (ZymoResearch, Orange, Calif.) to produce approximately 1600 clones containing the 418 expected mutations (i.e., 22 positions with 19 different amino acids at each position). Other plasmids expressing GST-tagged versions of 91D2e (EPCS 1314), 91D2e-b (EPSC1888) or EUGT11 (EPSC1744) as well as the empty pGEX4T1 (PSB12) were transformed as well.
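The library arithmetic above (22 positions, 19 alternative amino acids each, approximately 1600 clones) can be sketched as follows. The coverage estimate rests on an idealized assumption that is not stated in the source, namely that every clone is an independent, equally likely draw from the 418 variants. Real site-saturation libraries are usually not uniform, so the figure is only a rough upper-level guide. All variable names are hypothetical.

```python
positions = 22        # amino acid positions differing between UGT91D2e and UGT91D1
alternatives = 19     # non-wild-type amino acids per position
clones = 1600         # approximate number of clones obtained

# Number of distinct single-substitution variants expected in the library.
library_size = positions * alternatives

# ASSUMPTION (not from the source): each clone is an independent uniform draw.
# Probability that one given variant is absent from all clones:
p_missing = (1.0 - 1.0 / library_size) ** clones

# Expected fraction of variants represented at least once, and oversampling factor.
expected_fraction_covered = 1.0 - p_missing
expected_missing_variants = library_size * p_missing
oversampling = clones / library_size

print(f"library size: {library_size}")
print(f"oversampling: {oversampling:.1f}x")
print(f"expected fraction covered: {expected_fraction_covered:.3f}")
print(f"expected variants missing: {expected_missing_variants:.1f}")
```

Under the uniform-sampling assumption, roughly fourfold oversampling leaves a small number of variants unrepresented, which is one reason screening more clones than the nominal library size is common practice.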

Screening by LC-MS

To analyze the approximately 1600 mutant clones of UGT91D2e, the E. coli transformants were grown overnight at 30° C. in 1 ml of NZCYM containing ampicillin (100 mg/l) and chloramphenicol (33 mg/l), in 96-well format. The next day, 150 μl of each culture was inoculated into 3 ml NZCYM containing ampicillin (100 mg/l), chloramphenicol (33 mg/l), arabinose 3 mM, IPTG 0.1 mM and ethanol 2% v/v, in 24-well format, and incubated at 20° C. and 200 rpm for ~20 h. The following day, cells were spun down and pellets were resuspended in 100 μl of lysis buffer containing 10 mM Tris-HCl pH 8, 5 mM MgCl₂, 1 mM CaCl₂ and complete mini protease inhibitor EDTA-free (3 tablets/100 ml) (Hoffmann-La Roche, Basel, Switzerland) and frozen at −80° C. for at least 15 minutes to promote cell lysis. Pellets were thawed at room temperature and 50 μl of DNase mix (1 μl of 1.4 mg/ml DNase in H2O (~80000 u/ml), 1.2 μl of MgCl₂ 500 mM and 47.8 μl of 4×PBS buffer solution) was added to each well. Plates were shaken at 500 rpm for 5 min at room temperature to allow degradation of genomic DNA. Plates were spun down at 4000 rpm for 30 min at 4° C. and six μl of the lysates were used in UGT in vitro reactions as described for GST-91D2e-b, using rubusoside or rebaudioside A as substrates. In each case, the resulting compounds, stevioside or rebaudioside D (RebD), were measured by LC-MS. Results were analyzed in comparison with the stevioside or RebD produced by the lysates expressing the corresponding controls (91D2e, 91D2e-b, EUGT11 and the empty plasmid). Clones showing activity similar to or higher than the ones expressing 91D2e-b were selected as primary hits.

Half of the 1600 clones and the corresponding controls were assayed for their capacity to glycosylate rubusoside and rebaudioside A. Stevioside and RebD were quantified by LC-MS. Under the conditions used, lysates from clones expressing the native UGT91D2e show activity just around background with both substrates (approximately 0.5 μM stevioside and 1 μM RebD), while clones expressing UGT91D2e-b show consistently improved product formation (>10 μM stevioside; >1.5 μM RebD). Clones expressing EUGT11 consistently display a higher level of activity, especially using RebA as substrate. Cutoff for considering clones as primary hits in the screening was generally set at 1.5 μM for both products, but in some cases was adjusted for each independent assay.

Example 7—EUGT11 Homologs

A Blastp search of the NCBI nr database using the EUGT11 protein sequence revealed approximately 79 potential UGT homologs from 14 plant species (one of which is the Stevia UGT91D1, approximately 67% identical to EUGT11 in conserved UGT regions but less than 45% overall). Homologs with greater than 90% identity in conserved regions were identified from corn, soybean, Arabidopsis, grape, and Sorghum. The overall homology of the full-length EUGT11 homologs, at the amino acid level, was only 28-58% (see Table 11).

RNA was extracted from plant material by the method described by Iandolino et al. (Iandolino et al., Plant Mol Biol Reporter 22, 269-278, 2004), the RNeasy Plant mini Kit (Qiagen) according to the manufacturer's instructions, or using the Fast RNA Pro Green Kit (MP Biomedicals) according to the manufacturer's instructions. cDNA was produced by AffinityScript QPCR cDNA Synthesis Kit (Agilent) according to the manufacturer's instructions. Genomic DNA was extracted using the FastDNA kit (MP Biomedicals) according to the manufacturer's instructions. PCR was performed on cDNA using either the Dream Taq polymerase (Fermentas) or the Phusion polymerase (New England Biolabs) and a series of primers designed to amplify the homologs.

PCR-reactions were analyzed by electrophoresis in SyberSafe-containing agarose-TAE gels. DNA was visualized by UV-irradiation in a transilluminator. Bands of the correct size were cut out, purified through spin columns according to the manufacturer's specifications, and cloned into TOPO-Zero blunt (for Phusion polymerase-generated products) or TOPO-TA (for Dream Taq-generated products). The TOPO-vectors containing the PCR-products were transformed into E. coli DH5α and plated on LB-agar plates containing the appropriate selective antibiotics. DNA was extracted from surviving colonies and sequenced. The genes with the correct sequence were cut out by restriction digest with SbfI and AscI, cloned into similarly digested IVT8 vector and transformed into E. coli. PCRs were performed on all cloned genes to amplify the gene and flanking regions required for in vitro transcription and translation. Proteins were produced from the PCR products by in vitro transcription and translation using the Promega L5540, TNT T7 Quick for PCR DNA Kit according to the manufacturer's instructions. Production of protein was evaluated by incorporation of ³⁵S-methionine followed by separation by SDS-PAGE and visualization on a Typhoon phosphor-imager.

Activity assays were set up containing 20% (by volume) of each in vitro reaction, 0.1 mM rubusoside or RebA, 5% DMSO, 100 mM Tris-HCl pH 7.0, 0.01 units Fast alkaline phosphatase (Fermentas), and 0.3 mM UDP-glucose (final concentrations). Following incubation at 30° C. for one hour, the samples were analyzed by LC-MS for production of stevioside and RebD as described above. The UGT91D2e and UGT91D2e-b (double mutant described in Example 6) were used as positive controls, along with EUGT11. Under the initial assay conditions, clone P64B (see Table 11) produced a trace amount of product using rubusoside and RebA. Table 11 lists the percent identity at the amino acid level compared to EUGT11 for the whole length of the UGTs, which ranges from 28-58%. High amounts of homology (96-100%) were observed over shorter stretches of sequences, which may indicate highly conserved domains of plant UGTs.

TABLE 11 List of cloned EUGT11 homologs and their amino acid percent identity to EUGT11.

UGT            Accession          % identity to EUGT11
P44G           XP_002297733.1     32.16
P54A           XP_002532392.1     34.20
P51H           XP_002325254.1     32.53
P55D           XP_002533517.1     31.90
P5F            AAM12787.1         31.73
P48G           XP_002318358.1     33.20
P52F           XP_002334090.1     32.80
P48F           XP_002318358.1     33.00
T4B-b          NP_565540.4        31.19
P56C           XP_002533518.1     32.60
T67H           XP_002270294.1     34.06
T65E           CAN80742.1         34.98
T74G           XP_002270331.1     35.48
T65D           CAN80742.1         34.98
T69F1          XP_003635103.1     34.69
P6B            Q66PF2.1           33.20
P6D variant    Q66PF2.1           33.60
P64B           ACE87855.1         34.64
T3F            AT5G65550          34.94
P53H           XP_002527371.1     33.40
P53F           XP_002527371.1     33.40
P46H           XP_002303861.1     32.40
2-b            NP_199780.1        35.79
T70F           XP_002275802.1     36.67
T72A           XP_002275850.1     36.42
T71G           XP_002275824.1     37.25
P49G           XP_002320817.1     35.15
P57H           XP_002511902.1     36.23
45 Pop         XP_002302598.1     34.21
P50G           XP_002323718.1     32.86
P50H           XP_002323718.1     32.66
T73G           XP_002281094.1     32.05
63             XP_002458816.1     37.25
P78B           NP_001147674.1     35.33
62             XP_002458815.1     34.06
P9F            BAJ84800.1         37.92
T7H            NP_001240857.1     31.30
16-1           BAJ93155.1         58.03
T16H           BAJ93155.1         58.03
31TA           BAD35324.1         51.81
P41G           NP_001174664.1     35.40
P37G           NP_001051010.1     56.71
P60aH          XP_002466606.1     57.35
12             BAJ89368.1         44.35
P12A           BAJ89368.1         44.35
P12H           BAJ89368.1         44.35
P10B           BAJ86656.1         45.16
P58aF          XP_002463702.1     43.71
P59aG          XP_002463705.1     43.51
P76H           NP_001140711.1     28.81

Example 8—Cell-Free Biocatalytic Production of Reb-D

The cell-free approach is an in vitro system where RebA, stevioside or a steviol glycoside mixture is enzymatically converted to RebD. The system requires stoichiometric amounts of UDP-glucose, and therefore UDP-glucose regeneration from UDP and sucrose using sucrose synthase can be used. Additionally, sucrose synthase removes UDP produced during the reaction, which improves conversion to glycosylated products by alleviating the product inhibition observed for glycosylation reactions. See WO 2011/153378.

Enzyme Expression and Purification

UGT91D2e-b (described in Example 6) and EUGT11 are key enzymes that catalyze the glycosylation of RebA yielding RebD. These UGTs were expressed in bacteria (E. coli) but one of ordinary skill in the art will appreciate that such proteins also can be prepared using different methods and hosts (e.g., other bacteria such as Bacillus sp., yeast such as Pichia sp. or Saccharomyces sp., other fungi (e.g., Aspergillus), or other organisms). For example, the proteins can be produced by in vitro transcription and translation or by protein synthesis.

The UGT91D2e-b and EUGT11 genes were cloned in pET30a or pGEX4T1 plasmids. Resulting vectors were transformed into an XJb (DE3) Autolysis E. coli strain (ZymoResearch, Orange, Calif.). Initially, E. coli transformants were grown overnight at 30° C. in NZCYM medium, followed by induction with 3 mM arabinose and 0.1 mM IPTG, and further incubation overnight at 30° C. The corresponding fusion proteins were purified by affinity chromatography using the included 6HIS- or GST-tags and standard methods. One skilled in the art will appreciate that other protein purification methods such as gel filtration or other chromatography techniques also can be used, along with precipitation/crystallization or fractionation with, e.g., ammonium sulfate. While EUGT11 was expressed well using the initial conditions, UGT91D2e-b required several modifications to the base protocol to increase protein solubility, including lowering the temperature of the overnight expression from 30° C. to 20° C. and adding 2% ethanol to the expression medium. Generally 2-4 mg/l of soluble GST-EUGT11 and 400-800 μg/l of GST-UGT91D2e-b were purified with this method.

Stability of EUGT11

Reactions were conducted to explore the stability of EUGT11 under various RebA to RebD reaction conditions. Omitting the substrate from the reaction mixture, EUGT11 was pre-incubated for various periods of time before substrate was added. Following a pre-incubation of the enzyme in 100 mM Tris-HCl buffer, substrate (100 μM RebA) and other reaction components (300 μM UDP-glucose, and 10 U/mL Alkaline Phosphatase (Fermentas/Thermo Fisher, Waltham, Mass.)) were added (0, 1, 4 or 24 hours after the incubation was started). The reaction was then allowed to proceed for 20 h, after which the reactions were stopped and RebD product-formation measured. Experiments were repeated at different temperatures: 30° C., 32.7° C., 35.8° C. and 37° C.

The activity of EUGT11 was reduced rapidly when the enzyme was pre-incubated at 37° C., reaching approximately half activity after 1 hour, and having almost no activity after 4 hours. At 30° C., the activity was not significantly reduced after 4 hours and after 24 hours, approximately one-third of the activity remained. This suggests that EUGT11 is heat-labile.

To assess the thermal stability of EUGT11 and to compare it with the other UGTs in the steviol glycosylation pathway, denaturation temperatures of the proteins were determined using differential scanning calorimetry (DSC). Use of DSC thermograms to estimate denaturation temperatures, T_(D), is described, for example, by E. Freire in Methods in Molecular Biology 1995, Vol. 40 191-218. DSC was performed using 6HIS-purified EUGT11, yielding an apparent T_(D) of 39° C., while when GST-purified 91D2e-b was used, the measured T_(D) was 79° C. For reference, the measured T_(D) when using 6HIS-purified UGT74G1, UGT76G1 and UGT85C2 was 86° C. in all cases. One of skill in the art will recognize that enzyme immobilization or addition of thermal protectants to the reactions can be used to improve stability of the protein. Non-limiting examples of thermal protectants include trehalose, glycerol, ammonium sulphate, betaine, trimethylamine oxide, and proteins.

Enzyme Kinetics

A series of experiments were performed to determine kinetic parameters of EUGT11 and 91D2e-b. For both enzymes, 100 μM RebA, 300 μM UDP-glucose, and 10 U/mL Alkaline Phosphatase (Fermentas/Thermo Fisher, Waltham, Mass.) were used in the reactions. For EUGT11, the reactions were performed at 37° C. using 100 mM Tris-HCl, pH 7, and 2% enzyme. For 91D2e-b, the reactions were performed at 30° C. using 20 mM Hepes-NaOH, pH 7.6, 20% (by volume) enzyme. The initial velocities (V₀) were calculated in the linear range of a product versus time plot.

To first investigate the linearity intervals, initial time-courses were done for each enzyme. EUGT11 was assayed at 37° C. for 48 h at initial concentrations of 100 μM RebA and 300 μM UDP-glucose. UGT91D2e-b was assayed at 37° C. for 24 h at initial concentrations of 200 μM RebA and 600 μM UDP-glucose. Based on these range-finding studies, it was determined that the initial 10 minutes in the case of EUGT11, and the initial 20 minutes in the case of UGT91D2e-b, would be in the linear range with respect to product formation, and therefore initial velocities of each reaction were calculated in those intervals. In the case of EUGT11, RebA concentrations assayed were 30 μM, 50 μM, 100 μM, 200 μM, 300 μM and 500 μM. Concentration of UDP-glucose was always three times the concentration of RebA and incubation was performed at 37° C. By plotting the calculated V₀ as a function of the substrate concentrations, Michaelis-Menten curves were generated. By plotting the reciprocal of V₀ against the reciprocal of [S], a Lineweaver-Burk plot was obtained, with y = 339.85x + 1.8644; R² = 0.9759.

V_(max) and K_(M) parameters were determined from the Lineweaver-Burk curve fit, calculated from the y- and x-intercepts ((x=0, y=1/V_(max)) and (y=0, x=−1/K_(M))). Additionally, the same parameters also were calculated by a non-linear least squares regression, using the SOLVER function in Excel. The results obtained with both methods for EUGT11 and RebA are presented in Table 12, along with all the kinetic parameters of this example. K_(cat) is calculated based on V_(max) divided by the approximate amount of protein in the assay.
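The intercept calculation described above can be sketched in a few lines. The sketch assumes that the reported line is 1/V₀ (in min/μM) as a function of 1/[S] (in 1/μM), which is consistent with the units given in Table 12. The function names are hypothetical, and the non-linear SOLVER fit is not reproduced here because the underlying velocity data are not given in the text.

```python
def lineweaver_burk_to_mm(slope: float, intercept: float) -> tuple[float, float]:
    """Convert a Lineweaver-Burk line 1/V0 = slope * (1/[S]) + intercept
    into Michaelis-Menten parameters (Vmax, KM)."""
    if intercept <= 0:
        raise ValueError("intercept must be positive to give a finite, positive Vmax")
    vmax = 1.0 / intercept   # the y-intercept equals 1/Vmax
    km = slope / intercept   # the slope equals KM/Vmax, so KM = slope * Vmax
    return vmax, km


def mm_velocity(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten initial velocity at substrate concentration s."""
    return vmax * s / (km + s)


# Fit reported in the text for EUGT11 with RebA: y = 339.85x + 1.8644
vmax, km = lineweaver_burk_to_mm(slope=339.85, intercept=1.8644)
print(f"Vmax = {vmax:.2f} uM/min, KM = {km:.1f} uM")
print(f"x-intercept = {-1.0 / km:.5f} per uM")
print(f"predicted V0 at 100 uM RebA = {mm_velocity(100.0, vmax, km):.3f} uM/min")
```

The values obtained this way agree with the Lineweaver-Burk entries for EUGT11 with RebA in Table 12 (V_(max) of about 0.54 μM·min⁻¹ and K_(M) of about 182.3 μM).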

TABLE 12 Comparison of kinetic parameters for EUGT11 and UGT91D2e-b, with RebA or UDP-glucose as substrate.

Nonlinear Least Square Fit
                               RebA                   UDP-glucose
Parameter                      EUGT11     91D2e-b     EUGT11     91D2e-b
V_(max) (μM·min⁻¹)             0.52       0.34        0.79       0.19
K_(cat) (min⁻¹)                8.11       0.32        12.32      0.2
K_(M) (μM)                     162.5      1150        130        45.1
K_(cat)/K_(M) (min⁻¹·μM⁻¹)     0.05       0.000275    0.095      0.00454

Lineweaver-Burk plot
                               RebA                   UDP-glucose
Parameter                      EUGT11     91D2e-b     EUGT11     91D2e-b
V_(max) (μM·min⁻¹)             0.54       0.44        0.78       0.18
K_(cat) (min⁻¹)                8.42       0.41        12.1       0.19
K_(M) (μM)                     182.3      1580        118        41.9
K_(cat)/K_(M) (min⁻¹·μM⁻¹)     0.046      0.000258    0.102      0.00463

In order to investigate the influence of UDP-glucose concentration in the glycosylation reaction, as well as the affinity of EUGT11 for UDP-glucose, similar kinetic analyses were performed. EUGT11 was incubated with increasing amounts of UDP-glucose (20 μM, 50 μM, 100 μM, and 200 μM), maintaining an excess of RebA (500 μM). The kinetic parameters were calculated as described above, and are shown in Table 12.

In the case of UGT91D2e-b, RebA concentrations assayed were 50 μM, 100 μM, 200 μM, 300 μM, 400 μM and 500 μM. Concentration of UDP-glucose was always three times the concentration of RebA and incubation was performed at 30° C., in the reaction conditions described above for UGT91D2e-b. The kinetic parameters were calculated as previously described, and the resulting kinetic parameters are shown in Table 12. Additionally, kinetic parameters of UGT91D2e-b towards UDP-glucose were determined. UGT91D2e-b was incubated with increasing amounts of UDP-glucose (30 μM, 50 μM, 100 μM, and 200 μM), maintaining an excess of RebA (1500 μM). Incubation was performed at 30° C., in optimal conditions for UGT91D2e-b. The kinetic parameters were calculated as previously described and results are presented in Table 12.

By comparison of the kinetic parameters for EUGT11 and 91D2e-b, it was concluded that 91D2e-b has a lower K_(cat) and has lower affinity for RebA (higher K_(M)), although the K_(M) for UDP-glucose of 91D2e-b is lower than that of EUGT11. UGT91D2e-b has a lower K_(cat)/K_(M), which is a measure of catalytic efficiency, combining information on rate of catalysis with a particular substrate (K_(cat)) and the strength of enzyme-substrate binding (K_(M)).
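To make the comparison concrete, the catalytic efficiencies can be recomputed from the K_(cat) and K_(M) values listed under the nonlinear least squares fit in Table 12. This is a minimal sketch; the dictionary layout and variable names are hypothetical, and small differences from the tabulated K_(cat)/K_(M) values are expected because the table entries are rounded.

```python
# (enzyme, substrate) -> (Kcat in 1/min, KM in uM), nonlinear least squares values from Table 12
kinetics = {
    ("EUGT11", "RebA"): (8.11, 162.5),
    ("91D2e-b", "RebA"): (0.32, 1150.0),
    ("EUGT11", "UDP-glucose"): (12.32, 130.0),
    ("91D2e-b", "UDP-glucose"): (0.2, 45.1),
}

# Catalytic efficiency Kcat/KM in 1/(min * uM) for each enzyme-substrate pair.
efficiency = {key: kcat / km for key, (kcat, km) in kinetics.items()}

for (enzyme, substrate), value in efficiency.items():
    print(f"{enzyme:8s} {substrate:12s} Kcat/KM = {value:.6f} per min per uM")

# Fold difference in catalytic efficiency, EUGT11 relative to 91D2e-b.
fold_reba = efficiency[("EUGT11", "RebA")] / efficiency[("91D2e-b", "RebA")]
fold_udpg = efficiency[("EUGT11", "UDP-glucose")] / efficiency[("91D2e-b", "UDP-glucose")]
print(f"EUGT11 vs 91D2e-b, RebA: {fold_reba:.0f}-fold")
print(f"EUGT11 vs 91D2e-b, UDP-glucose: {fold_udpg:.0f}-fold")
```

With these inputs, the efficiency of EUGT11 toward RebA exceeds that of 91D2e-b by roughly two orders of magnitude, whereas the difference toward UDP-glucose is smaller because 91D2e-b has the lower K_(M) for that substrate.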

Determining the Limiting Factor in Reactions

Under the conditions described above for EUGT11, approximately 25% of the RebA administered was converted to RebD. The limiting factor in these conditions could be either the enzyme, UDP-glucose or RebA. Experiments were set up to distinguish between these possibilities. A standard assay was allowed to run its course for 4 hours. This was followed by addition of either extra RebA substrate, extra enzyme, extra UDP-glucose or extra enzyme and UDP-glucose. Addition of extra enzyme resulted in a relative increase of the conversion of around 50%; adding extra RebA or UDP-glucose alone did not increase the conversion significantly, but the simultaneous addition of enzyme and UDP-glucose increased the conversion approximately 2-fold.

Experiments were conducted to examine the limit to this benefit of adding bolus amounts of UDP-glucose and fresh enzyme in the conversion of RebA to RebD. Additional enzyme or enzyme and UDP-glucose were added after 1, 6, 24 and 28 hours. In the case of the addition of both extra EUGT11 and UDP-glucose, a conversion of more than 70% was achieved. No other components had a significant effect on the conversion. This indicates that EUGT11 is a primary limiting factor for the reaction but UDP-glucose also is limiting. As UDP-glucose is present at 3-fold higher concentration than RebA, this indicates that UDP-glucose may be somewhat unstable in the reaction mixture, at least in the presence of EUGT11. Alternatively, as explained below, EUGT11 may be metabolizing the UDP-glucose.

Inhibition Studies

Experiments were conducted to determine if factors such as sucrose, fructose, UDP, product (RebD) and impurities in the less pure Stevia extract raw materials inhibited the extent of the conversion of steviol glycoside substrates to RebD. In a standard reaction mixture, an excess of each potential inhibitor (sucrose, fructose, UDP, RebD, or a commercial blend of steviol glycosides (Steviva, Steviva Brands, Inc., Portland, Oreg.)) was added. Following incubation, RebD production was quantified. Addition of 500 μg/ml of the commercial Steviva mix (approximately 60% 1,2-stevioside, 30% RebA, 5% rubusoside, 2% 1,2-bioside, less than 1% of RebD, RebC and others, as evaluated by LC-MS) was not found to be inhibitory, but rather increased the overall RebD production (to around 60 μM from around 30 μM without any addition) well beyond the RebD originally added with the blend (around 5 μM). Of the molecules tested, only UDP was shown to have an inhibitory effect on RebD production at the concentration used (500 μM), as measured by LC-MS. The RebD that was produced was less than 7 μM. This inhibition can be alleviated in the in vivo or in vitro reactions for RebD production by including a system that recycles UDP to UDP-glucose, either by yeast or by an added SUS (sucrose synthase enzyme) in conjunction with sucrose. Moreover, when working with lower amounts of UDP-glucose (300 μM), the addition of alkaline phosphatase to remove UDP does not increase the amount of RebD produced in the in vitro glycosylations substantially, suggesting that the UDP produced may not be inhibitory at these concentrations.

RebA Vs Crude Steviol Glycoside Mix

In some experiments, a crude steviol glycoside mix was used as a source of RebA instead of purified RebA. As such a crude steviol glycoside mix contains a high percentage of stevioside along with RebA, UGT76G1 was included in the reactions. In vitro reactions were performed as described above using 0.5 g/l of the Steviva® mix as substrate and enzyme (UGT76G1 and/or EUGT11) and incubated at 30° C. The presence of steviol glycosides was analyzed by LC-MS.

When only UGT76G1 was added to the reactions, stevioside was converted to RebA quite efficiently. An unknown penta-glycoside (with a retention time peak at 4.02 min) also was detected. When only EUGT11 was added to the reaction, large amounts of RebE, RebA, RebD and an unknown steviol-pentaglycoside (with a retention time peak at 3.15 min) were found. When both EUGT11 and UGT76G1 were added to the reactions, the stevioside peak was reduced, and almost entirely converted to RebA and RebD. There were trace amounts of the unknown steviol-pentaglycoside (peak at 4.02 min). No RebE was detected nor was the second unknown steviol-pentaglycoside (peak at 3.15 min). This result indicated that the use of stevia extracts as a substrate to produce RebD in vitro is possible when EUGT11 and UGT76G1 are used in combination.

Non-Specific UDP-Glucose Metabolism

To determine if EUGT11 can metabolize UDP-glucose independently of the conversion of RebA to RebD, GST-purified EUGT11 was incubated in the presence or absence of RebA substrate, and UDP-glucose usage was measured as UDP-release, using the TR-FRET Transcreener® kit (BellBrook Labs). The Transcreener® kit is based on a tracer molecule bound to an antibody. The tracer molecule is displaced by UDP or ADP in a highly sensitive and quantitative manner. The FP kit includes an Alexa633 tracer bound to an antibody. The tracer is displaced by UDP/ADP. The displaced tracer freely rotates, leading to a decrease in fluorescence polarization. Therefore, UDP production is proportional to a decrease in polarization. The FI kit includes a quenched Alexa594 Tracer bound to an antibody, which is conjugated to an IRDye® QC-1 quencher. The tracer is displaced by UDP/ADP, whereby the displaced tracer is un-quenched, leading to a positive increase in fluorescence intensity. Therefore, UDP production is proportional to an increase in fluorescence. A TR-FRET kit includes a HiLyte647 Tracer bound to an Antibody-Tb conjugate. Excitation of the terbium complex in the UV range (ca. 330 nm) results in energy transfer to the tracer and emission at a higher wavelength (665 nm) after a time delay. The tracer is displaced by UDP/ADP, causing a decrease in TR-FRET.

It was observed that the UDP release measured was the same independent of the presence of RebA substrate. UDP release was not detectable in the absence of enzyme. This indicates a non-specific degradation of UDP-glucose by EUGT11. Nevertheless, RebD was still produced when RebA was added, suggesting that EUGT11 would preferentially catalyze RebA glycosylation over the non-specific UDP-glucose degradation.

Experiments were set up to determine the fate of the glucose molecule in the absence of RebA or other obvious glycosylation substrates. One common factor in all previous reactions was the presence of Tris buffer and/or trace amounts of glutathione, which both contain potential glycosylation sites. The effect of these molecules on the non-specific UDP-glucose consumption was assayed using GST-purified EUGT11 (with glutathione) and HIS-purified enzyme (without glutathione) in in vitro reactions, in the presence or absence of RebA. UDP-glucose usage was measured as UDP-release, using the TR-FRET Transcreener® kit. UDP release occurred in all cases and was independent of the presence of RebA. UDP release was slower when the HIS-purified enzyme was used, but the overall catalytic activity of the enzyme in conversion of RebA to RebD was also lower, suggesting a lower amount of active soluble enzyme present in the assay. Therefore, it appears that the UDP-glucose metabolism by EUGT11 is independent of the presence of substrate and independent of the presence of glutathione in the reaction, under the conditions tested.

To test the effect of Tris on the metabolism of UDP-glucose by EUGT11, GST-EUGT11 was purified using a Tris- or a PBS-based buffer for the elution, obtaining similar amounts of protein in both cases. Tris- and PBS-purified enzymes were used in in vitro reactions using Tris and HEPES as buffers respectively, in the presence or absence of RebA in a similar manner as above. In both conditions, the UDP release was the same in the reactions whether RebA was added or not, indicating that the metabolism of UDP-glucose by EUGT11 is independent of both the presence of RebA and Tris in the reaction. This suggests that the UDP-release detected may somehow be an artifact caused by a property of EUGT11 or, alternatively, EUGT11 may be hydrolyzing UDP-glucose. EUGT11 is still efficient at converting RebA to RebD preferentially and the loss of UDP-glucose can be compensated by addition of the sucrose synthase recycling system described below.

RebA Solubility

The solubility of RebA determines the concentration that can be used both for the whole-cell approach and for the cell-free approach. Several different solutions of RebA in water were made and left at room temperature for several days. After 24 hours of storage, RebA precipitated at concentrations of 50 mM or higher. Twenty-five mM RebA started to precipitate after 4-5 days, while 10 mM or lower concentrations remained in solution, even when stored at 4° C.

RebD Solubility

The solubility of RebD was assessed by making several different solutions of RebD in water and incubating them at 30° C. for 72 hours. RebD was found to be soluble in water initially at concentrations of 1 mM or lower, while concentrations of 0.5 mM or less were found to be stable for longer periods of time. One with skill in the art will recognize that the solubility can be influenced by any number of conditions such as pH, temperature, or different matrices.

Sucrose Synthase

Sucrose synthase (SUS) has been used to regenerate UDP-glucose from UDP and sucrose (FIG. 11) for other small molecule glycosylations (Masada Sayaka et al., FEBS Letters 581 (2007) 2562-2566). Three SUS1 genes from A. thaliana, S. rebaudiana and coffee (Coffea arabica) were cloned into pGEX4T1 E. coli expression vectors (see FIG. 17 for the sequences). Using methods similar to those described for EUGT11, around 0.8 mg/l of GST-AtSUS1 (A. thaliana SUS1) was purified. Initial expression of CaSUS1 (Coffea arabica SUS1) and SrSUS1 (S. rebaudiana SUS1) followed by GST-purification did not produce significant amounts of protein although, when analyzed by western blot, the presence of GST-SrSUS1 was verified. When GST-SrSUS1 was expressed at 20° C. in the presence of 2% ethanol, approximately 50 μg/l of enzyme was produced.

Experiments were performed to evaluate the UDP-glucose regenerating activity of the purified GST-AtSUS1 and GST-SrSUS1. In vitro assays were conducted in 100 mM Tris-HCl pH 7.5 and 1 mM UDP (final concentration). Either ~2.4 μg of purified GST-AtSUS1, ~0.15 μg of GST-SrSUS1, or ~1.5 μg of commercial BSA (New England Biolabs, Ipswich, Mass.) was also added. Reactions were done in the presence or absence of ~200 mM sucrose and incubated at 37° C. for 24 h. Product UDP-glucose was measured by HPLC as described in the analytical section. AtSUS1 produced ~0.8 mM UDP-glucose when sucrose was present. No UDP-glucose was observed when SrSUS1 or the negative control (BSA) was used. The lack of activity observed for SrSUS1 could be explained by the poor quality and concentration of the purified enzyme. UDP-glucose production by AtSUS1 was sucrose dependent and, therefore, it was concluded that AtSUS1 can be used in a coupled reaction to regenerate the UDP-glucose used by EUGT11 or other UGTs for small molecule glycosylation (FIG. 11, above).

SUS catalyzes the formation of UDP-glucose and fructose from sucrose and UDP, as depicted in FIG. 11. This UDP-glucose then can be used by EUGT11 for glycosylation of RebA to produce RebD. In vitro assays as described above were performed, adding ~200 mM sucrose, 1 mM UDP, 100 μM RebA, ~1.6 μg purified GST-AtSUS1 and ~0.8 μg GST-EUGT11. Formation of the product, RebD, was evaluated by LC-MS. When AtSUS, EUGT11, sucrose and UDP were mixed with RebA, 81±5 μM of RebD was formed. The reaction was dependent on the presence of AtSUS, EUGT11 and sucrose. The conversion rate was similar to what has been observed previously using exogenously supplied UDP-glucose. This shows that AtSUS can be used to regenerate UDP-glucose for RebD formation by EUGT11.
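The molar conversion implied by the coupled assay can be worked out directly from the figures given above. The following is a minimal arithmetic sketch, assuming a 1:1 stoichiometry between RebA consumed and RebD formed; the function and variable names are illustrative only and are not part of the disclosure.

```python
# Values taken from the coupled AtSUS/EUGT11 assay described above.
REBA_SUPPLIED_UM = 100.0  # RebA added to the reaction (uM)
REBD_FORMED_UM = 81.0     # mean RebD formed (uM)
REBD_SPREAD_UM = 5.0      # reported +/- spread on RebD (uM)


def percent_conversion(product_um, substrate_um):
    """Molar conversion of substrate to product in percent (assumes 1:1 stoichiometry)."""
    return 100.0 * product_um / substrate_um


conversion = percent_conversion(REBD_FORMED_UM, REBA_SUPPLIED_UM)
conversion_low = percent_conversion(REBD_FORMED_UM - REBD_SPREAD_UM, REBA_SUPPLIED_UM)
conversion_high = percent_conversion(REBD_FORMED_UM + REBD_SPREAD_UM, REBA_SUPPLIED_UM)

print(f"RebA to RebD conversion: {conversion:.0f}% "
      f"(range {conversion_low:.0f}-{conversion_high:.0f}%)")
```

Under these assumptions, roughly four fifths of the supplied RebA is recovered as RebD in the coupled reaction.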

Example 9: Whole-Cell Biocatalytic Production of RebD

In this example, several parameters were studied that are factors for using whole cell biocatalytic systems in the production of RebD from RebA or other steviol glycosides. The ability of raw materials to cross the cell membrane and availability of UDP-glucose are two such factors. Permeabilizing agents were studied as well as different cell types to ascertain which systems may be the most beneficial for RebD production.

Permeabilizing Agents

Several different permeabilization agents have previously been shown to allow intracellular enzymatic conversion of various compounds that are normally not able to cross a cell membrane (Chow and Palecek, Biotechnol Prog. 2004 March-April; 20(2):449-56). In several cases, the approaches resemble a partial lysis of the cells and, in yeast, often rely on the removal of the cell membrane by a detergent and the encapsulation of the enzymes inside of the remaining cell wall, which is permeable to smaller molecules. Common to these methods is the exposure to the permeabilizing agent followed by a centrifugation step to pellet cells before the addition of the substrate. See, for example, Flores et al., Enzyme Microb. Technol., 16, pp. 340-346 (1994); Presecki & Vasic-Racki, Biotechnology Letters, 27, pp. 1835-1839 (2005); Yu et al., J Ind Microbiol Biotechnol, 34, 151-156 (2007); Chow and Palecek, Biotechnol. Prog., 20, pp. 449-456 (2004); Fernandez et al., Journal of Bacteriology, 152, pp. 1255-1264 (1982); Kondo et al., Enzyme and Microbial Technology, 27, pp. 806-811 (2000); Abraham and Bhat, J Ind Microbiol Biotechnol, 35, pp. 799-804 (2008); Liu et al., Journal of Bioscience and Bioengineering, 89, pp. 554-558 (2000); and Gietz and Schiestl, Nature Protocols, 2, pp. 31-34 (2007) regarding permeabilization of yeast. See, Naglak and Wang, Biotechnology and Bioengineering, 39, pp. 732-740 (1991); Alakomi et al., Applied and Environmental Microbiology, 66, pp. 2001-2005 (2000); and Fowler and Zabin, Journal of Bacteriology, 92, pp. 353-357 (1966) regarding permeabilization of bacteria. As described in this example, it was determined whether cells could remain viable and therefore retain de novo UDP-glucose biosynthesis.

Experiments were done to establish conditions for permeabilization in E. coli and in yeast. Growing cells (S. cerevisiae or E. coli) were treated with different concentrations/combinations of permeabilization agents: toluene, chloroform and ethanol for permeabilization of S. cerevisiae, and guanidine, lactic acid, DMSO and/or Triton X-100 for permeabilization of E. coli. Tolerance of both model organisms to high concentrations of RebA and other potential substrates also was evaluated. The permeabilization was measured by the amount of RebD produced from a EUGT11-expressing organism after incubation in a RebA-containing medium (feeding experiment). Enzyme activity was monitored before and after exposure to the permeabilizing agents by lysing the cells and analyzing the activity of the released UGTs in an in vitro assay.

In yeast, none of the permeabilization conditions tested resulted in an increase in RebD above the detected background (i.e., contaminating RebD levels present in the RebA stock used for feeding). This indicates that, under the tested conditions, yeast cells remain impermeable to RebA and/or the reduced cell viability caused by the solvents results in a decrease of EUGT11 activity as well.

In E. coli, none of the conditions tested resulted in permeabilization of the cells and subsequent production of RebD above background levels. Detectable levels of RebD were measured when lysates from strains expressing EUGT11 were used in the in vitro reactions (data not shown), indicating that EUGT11 enzyme is present and active even after all permeabilization treatments (though the level of activity varies). The permeabilization treatments had little or no effect on cell viability, except treating cultures with 0.2 M guanidine and 0.5% Triton X-100, which severely decreased viability.

S. cerevisiae also was subjected to permeabilization assays that do not allow further growth of the cells, using Triton X-100, N-lauryl sarcosine (LS), or lithium acetate+polyethylene glycol (LiAc+PEG). That is, under these conditions, permeabilization renders the cells unviable by removing the cell membrane altogether while retaining the cell wall as a barrier to keep enzymes and gDNA inside. In such methods, UDP-glucose can be supplemented or recycled as described above. The advantage of permeabilization versus the purely in vitro approach is that individual enzymes do not need to be separately produced and isolated.

N-Lauryl sarcosine treatment resulted in inactivation of EUGT11, and only a minor increase in RebD was detected when LiAc/PEG was applied (data not shown). Treatment with 0.3% or 0.5% Triton X-100, however, increased the amount of RebD above background levels (see FIG. 18) while sustaining the activity of EUGT11. For Triton X-100 assays, overnight cultures were washed three times in PBS buffer. Cells corresponding to 6 OD₆₀₀ units were resuspended in PBS containing 0.3% or 0.5% Triton X-100, respectively. Treated cells were vortexed and incubated for 30 minutes at 30° C. After treatment, cells were washed in PBS buffer. Cells corresponding to 5 OD₆₀₀ units were used in an in vitro assay, as described for GST-EUGT11, and 0.6 OD₆₀₀ units were resuspended in reaction buffer and incubated overnight at 30° C. as described for the LS treated samples. Untreated samples were used as controls.

Lysates from transformants expressing EUGT11 were able to convert some RebA into RebD (8 to 50 μM were measured in the reactions) when cells were untreated or after treatment with LiAc/PEG or Triton X-100. However, no RebD was measured in lysates of cell pellets treated with LS. Permeabilized but non-lysed cells were able to produce some RebD (1.4 to 1.5 μM measured) when treated with 0.3% or 0.5% Triton X-100 (FIG. 18), while no RebD was found in the samples treated with LS or LiAc/PEG. These results show that RebD can be produced from RebA biocatalytically using whole cells and using Triton X-100 as the permeabilizing agent.

Example 10—Assessment of Codon Optimized UGT Sequences

Optimal coding sequences for UGT91D2e, 74G1, 76G1, and 85C2 were designed and synthesized for yeast expression using two methodologies, supplied by GeneArt (Regensburg, Germany) (SEQ ID NOs: 6, 2, 8, and 4, respectively) or DNA 2.0 (Menlo Park, Calif.) (SEQ ID NOs: 84, 83, 85, and 82, respectively). The amino acid sequences of UGT91D2e, 74G1, 76G1, and 85C2 (SEQ ID NOs: 5, 1, 7, and 3, respectively) were not changed.

The wild-type, DNA 2.0, and GeneArt sequences were assayed for in vitro activity to compare reactivity on substrates in the steviol glycosides pathway. UGTs were inserted in high copy (2μ) vectors and expressed from a strong constitutive promoter (GPD1) (vectors P423-GPD, P424-GPD, P425-GPD, and P426-GPD). The plasmids were transformed individually into the universal Watchmaker strain, EFSC301 (described in Example 3 of WO2011/153378), and assays were carried out using cell lysates prepared from equal amounts of cells (8 OD units). For the enzymatic reactions, 6 μL of each cell lysate was incubated in a 30 μL reaction with 0.25 mM steviol (final concentration) to test UGT74G1 and UGT85C2 clones, and with 0.25 mM 13-SMG (final concentration) to test UGT76G1 and UGT91D2e clones. Assays were carried out for 24 hours at 30° C. Prior to LC-MS analysis, one volume of 100% DMSO was added to each reaction, samples were centrifuged at 16000 g, and the supernatants were analyzed.

The lysates expressing the GeneArt-optimized genes provided higher levels of UGT activity under the conditions tested. Expressed as a percentage of the wild-type enzyme, the GeneArt lysates showed equivalent activity to the wild-type for UGT74G1, 170% activity for UGT76G1, 340% activity for UGT85C2 and 130% activity for UGT91D2e. Using the GeneArt-optimized UGT85C2 may therefore improve the overall flux and productivity of cells for production of Reb-A and Reb-D when expressed in S. cerevisiae.

Further experiments were conducted to determine if the codon-optimized UGT85C2 could reduce 19-SMG accumulation and increase production of rubusoside and higher glycosylated steviol glycosides. The production of 19-SMG and rubusoside was analyzed in a steviol-feeding experiment of S. cerevisiae strain BY4741 expressing the wild type UGT74G1 as well as the codon-optimized UGT85C2 from high copy (2μ) vectors under a strong constitutive promoter (GPD1) (vectors P426-GPD and P423-GPD, respectively). Whole culture samples (without cell removal) were taken and boiled in an equal volume of DMSO for total glycosides levels. Intracellular concentrations reported were obtained by pelleting cells and resuspending in 50% DMSO to the volume of the original culture sample taken, followed by boiling. The “total” glycosides level and the normalized intracellular level then were measured using LC-MS. Using wild type UGT74G1 and wild type UGT85C2, approximately 13.5 μM rubusoside was produced in total with a maximum normalized intracellular concentration of about 7 μM. In contrast, when wild type UGT74G1 and codon-optimized UGT85C2 were used, a maximum of 26 μM rubusoside was produced, or approximately double what was produced using the wild type UGT85C2. Additionally, the maximum normalized intracellular concentration of rubusoside was 13 μM, again an approximate doubling of what was produced using wild type UGT85C2. Intracellular concentration of 19-SMG was significantly reduced from a maximum of 35 μM using the wild type UGT85C2 to 19 μM using the codon-optimized UGT85C2. Consequently, about 10 μM less total 19-SMG was measured for the codon-optimized UGT85C2. This shows that more 19-SMG is converted into rubusoside and confirms that the wild type UGT85C2 is a bottleneck.
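The "approximate doubling" statements above can be checked with simple ratios. The sketch below uses only the concentrations reported in this example; the helper names are illustrative and are not part of the disclosure.

```python
def fold_change(new_value, reference_value):
    """Ratio of a new measurement to its reference measurement."""
    return new_value / reference_value


def percent_reduction(reference_value, new_value):
    """Reduction relative to the reference measurement, in percent."""
    return 100.0 * (reference_value - new_value) / reference_value


# Concentrations (uM) reported above: wild type vs. codon-optimized UGT85C2.
rubusoside_total_fold = fold_change(26.0, 13.5)   # total rubusoside
rubusoside_intra_fold = fold_change(13.0, 7.0)    # normalized intracellular rubusoside
smg19_intra_reduction = percent_reduction(35.0, 19.0)  # intracellular 19-SMG

print(f"Total rubusoside: {rubusoside_total_fold:.2f}-fold")
print(f"Intracellular rubusoside: {rubusoside_intra_fold:.2f}-fold")
print(f"Intracellular 19-SMG reduction: {smg19_intra_reduction:.1f}%")
```

Both rubusoside ratios fall just below 2, consistent with the description of an approximate doubling, and intracellular 19-SMG falls by a little under half.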

During diversity screening, another homolog of UGT85C2 was discovered in the course of Stevia rebaudiana cDNA cloning. The homolog has the following combination of conserved amino acid polymorphisms (with respect to the amino acid numbering of the wild-type S. rebaudiana UGT85C coding sequence set forth in Accession No. AY345978.1): A65S, E71Q, T270M, Q289H, and A389V. This clone, termed UGT85C2 D37, was expressed through coupled in vitro transcription-translation of PCR products (TNT® T7 Quick for PCR DNA kit, Promega). The expression product was assayed for glycosylation activity using steviol (0.5 mM) as the sugar acceptor, as described in WO2011/153378 with the exception that assays were allowed to incubate for 24 hours. As compared to the wild type UGT85C2 control assay, the D37 enzyme appears to have approximately 30% higher glycosylation activity.

Example 11—Identification of a Novel S. rebaudiana KAH

A partial sequence (GenBank Accession No. BG521726) was identified in the Stevia rebaudiana EST database that had some homology to a Stevia KAH. The partial sequence was blasted against raw Stevia rebaudiana pyrosequencing reads using CLC main workbench software. Reads that partially overlapped with the ends of the partial sequence were identified and used to increase the length of the partial sequence. This was done several times until the sequence encompassed both the start and the stop codons. The complete sequence was analyzed for frameshift mutations and nucleotide substitutions that may have been introduced by blasting the complete sequence against the raw pyrosequencing reads. The resulting sequence was designated SrKAHe1. See FIG. 12.

Activity of the KAH encoded by SrKAHe1 was assessed in vivo in S. cerevisiae background strain CEN.PK 111-61A, which expresses genes encoding enzymes constituting the entire biosynthetic pathway from the yeast secondary metabolites isopentenyl pyrophosphate (IPP) and farnesyl pyrophosphate (FPP) to steviol-19-O-monoside, except the steviol synthase enzyme that converts ent-kaurenoic acid to steviol.

Briefly, the S. cerevisiae strain CEN.PK 111-61A was modified to express an Aspergillus nidulans GGPPS, a 150 nt truncated Zea mays CDPS (with a new start codon, see below), a S. rebaudiana KS, a S. rebaudiana KO and the S. rebaudiana UGT74G1 from chromosomally integrated gene copies, with TPI1 and GPD1 yeast promoters driving transcription. The CEN.PK 111-61A yeast strain that expresses all of these genes was designated EFSC2386. Thus, strain EFSC2386 contained the following integrated genes: Aspergillus nidulans geranyl geranyl pyrophosphate synthase (GGPPS); Zea mays ent-copalyl diphosphate synthase (CDPS); Stevia rebaudiana ent-kaurene synthase (KS); Stevia rebaudiana ent-kaurene oxidase (KO); and Stevia rebaudiana UGT74G1; in combination with the pathway from IPP and FPP to steviol-19-O-monoside, without a steviol synthase (KAH).

Expression of different steviol synthases (from episomal expression plasmids) was tested in strain EFSC2386 in combination with the expression of various CPRs (from episomal expression plasmids), and production of steviol-19-O-monoside was detected by LC-MS analysis of culture sample extracts. The nucleic acids encoding the CPRs were inserted in the multi cloning site of the p426 GPD basic plasmid while the nucleic acids encoding the steviol synthases were inserted in the multi cloning site of the p415 TEF basic plasmid (p4XX basic plasmid series by Mumberg et al., Gene 156 (1995), 119-122). Production of steviol-19-O-monoside occurs when a functional steviol synthase enzyme is present.

The KAHs that were expressed from episomal expression plasmids in strain EFSC2386 were “indKAH” (Kumar et al., Accession No. DQ398871; Reeja et al., Accession No. EU722415); “KAH1” (S. rebaudiana steviol synthase from Brandle et al., U.S. Patent Publication No. 2008/0064063 A1); “KAH3” (A. thaliana steviol synthase from Yamaguchi et al., U.S. Patent Publication No. 2008/0271205 A1); “SrKAHe1” (S. rebaudiana steviol synthase cloned from S. rebaudiana cDNA as described above); and “DNA2.0.SrKAHe1” (codon optimized sequence (DNA2.0) encoding S. rebaudiana steviol synthase, see FIG. 12B).

The CPRs that were expressed from episomal expression plasmids in strain EFSC2386 were “CPR1” (S. rebaudiana NADPH dependent cytochrome P450 reductase, Kumar et al., Accession No. DQ269454); “ATR1” (A. thaliana CPR, Accession No. CAA23011, see also FIG. 13); “ATR2” (A. thaliana CPR, Accession No. CAA46815, see also FIG. 13); “CPR7” (S. rebaudiana CPR, see FIG. 13, CPR7 is similar to “CPR1”); “CPR8” (S. rebaudiana CPR, similar to Artemisia annua CPR, see FIG. 13); and “CPR4” (S. cerevisiae NCP1, Accession No. YHR042W, see also FIG. 13).

Table 13 provides the levels of steviol-19-O-monoside (μM) in strain EFSC2386 with the various combinations of steviol synthases and CPRs.

TABLE 13
19-SMG Production

Strain                      19-SMG production (μM)
“indKAH” CPR1                0.000
“indKAH” ATR1                0.000
“indKAH” ATR2                0.000
“indKAH” CPR7                0.000
“indKAH” CPR8                0.000
“indKAH” CPR4                0.000
“KAH1” CPR1                  0.000
“KAH1” ATR1                  0.000
“KAH1” ATR2                  0.000
“KAH1” CPR7                  0.000
“KAH1” CPR8                  0.000
“KAH1” CPR4                  0.000
“KAH3” CPR1                  5.300
“KAH3” ATR1                  5.921
“KAH3” ATR2                  0.000
“KAH3” CPR7                  5.693
“KAH3” CPR8                  0.000
“KAH3” CPR4                  0.000
“SrKAHe1” CPR1              20.129
“SrKAHe1” ATR1              15.613
“SrKAHe1” ATR2              40.407
“SrKAHe1” CPR7              33.724
“SrKAHe1” CPR8              41.695
“SrKAHe1” CPR4              28.949
“DNA2.0.SrKAHe1” CPR1       26.065
“DNA2.0.SrKAHe1” ATR1       26.974
“DNA2.0.SrKAHe1” ATR2       54.354
“DNA2.0.SrKAHe1” CPR7       30.797
“DNA2.0.SrKAHe1” CPR8       50.956
“DNA2.0.SrKAHe1” CPR4       30.368

Only KAH3 and the steviol synthase encoded by SrKAHe1 had activity when expressed in S. cerevisiae. The DNA 2.0 codon-optimized SrKAHe1 sequence encoding steviol synthase resulted in a level of steviol-19-O-monoside accumulation that was approximately one order of magnitude higher as compared with a codon optimized KAH3 when each was co-expressed with its optimal CPR. In the experiments presented in this example, the combination of KAH1 and ATR2 CPR did not result in the production of steviol-19-O-monoside.
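The comparison of steviol synthases at their best-performing CPR can be reproduced from the Table 13 values. The following sketch transcribes the table into a dictionary and picks the highest titer per KAH; it assumes the “KAH3” rows of Table 13 are the comparison referred to above, and all names in the code are illustrative rather than part of the disclosure.

```python
ZERO_ROW = {"CPR1": 0.0, "ATR1": 0.0, "ATR2": 0.0, "CPR7": 0.0, "CPR8": 0.0, "CPR4": 0.0}

# 19-SMG titers (uM) transcribed from Table 13, keyed by KAH label and then CPR label.
TABLE_13 = {
    "indKAH": dict(ZERO_ROW),
    "KAH1": dict(ZERO_ROW),
    "KAH3": {"CPR1": 5.300, "ATR1": 5.921, "ATR2": 0.000,
             "CPR7": 5.693, "CPR8": 0.000, "CPR4": 0.000},
    "SrKAHe1": {"CPR1": 20.129, "ATR1": 15.613, "ATR2": 40.407,
                "CPR7": 33.724, "CPR8": 41.695, "CPR4": 28.949},
    "DNA2.0.SrKAHe1": {"CPR1": 26.065, "ATR1": 26.974, "ATR2": 54.354,
                       "CPR7": 30.797, "CPR8": 50.956, "CPR4": 30.368},
}


def best_pairing(titers_by_cpr):
    """Return (CPR label, titer) for the CPR giving the highest 19-SMG titer."""
    cpr = max(titers_by_cpr, key=titers_by_cpr.get)
    return cpr, titers_by_cpr[cpr]


best = {kah: best_pairing(cprs) for kah, cprs in TABLE_13.items()}
# Keep only the steviol synthases that gave any detectable product.
active = {kah: pair for kah, pair in best.items() if pair[1] > 0}
fold_vs_kah3 = best["DNA2.0.SrKAHe1"][1] / best["KAH3"][1]

for kah, (cpr, titer) in active.items():
    print(f"{kah}: best CPR {cpr}, {titer:.3f} uM")
print(f"DNA2.0.SrKAHe1 vs KAH3 at best CPR: {fold_vs_kah3:.1f}-fold")
```

On these figures the ratio is a little over 9, which is consistent with the description of approximately one order of magnitude.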

Example 12—Pairings of CPRs and KO

The CEN.PK S. cerevisiae EFSC2386 strain and the CPRs referred to in this Example are described in Example 11 (“Identification of a Novel S. rebaudiana KAH”). EFSC2386 contained the following integrated genes: Aspergillus nidulans geranyl geranyl pyrophosphate synthase (GGPPS); Zea mays ent-copalyl diphosphate synthase (CDPS); Stevia rebaudiana ent-kaurene synthase (KS); and Stevia rebaudiana ent-kaurene oxidase (KO). This strain produces ent-kaurenoic acid, which was detected by LC-MS analysis.

A collection of cytochrome P450 reductases (CPRs) was expressed and tested in strain EFSC2386: “CPR1” (S. rebaudiana NADPH dependent cytochrome P450 reductase, Kumar et al., Accession No. DQ269454); “ATR1” (A. thaliana CPR, Accession No. CAA23011); “ATR2” (A. thaliana CPR, Accession No. CAA46815); “CPR7” (S. rebaudiana CPR, CPR7 is similar to “CPR1”); “CPR8” (S. rebaudiana CPR, similar to Artemisia annua CPR); and “CPR4” (S. cerevisiae NCP1, Accession No. YHR042W).

Overexpression of the S. cerevisiae endogenous native CPR (referred to as CPR4 in Table 14), and especially overexpression of one of the A. thaliana CPRs, namely ATR2, gives good activation of the Stevia rebaudiana kaurene oxidase (the latter called KO1 in Table 14) and results in increased accumulation of ent-kaurenoic acid. See Table 14, which presents the area under curve (AUC) of the ent-kaurenoic acid peak in the LC-MS chromatograms. KO1 is an ent-kaurenoic acid producing yeast control strain without additional overexpression of CPRs.

TABLE 14
Effect of Different Cytochrome P450 Reductase Enzymes with KO-1

Cytochrome P450 Reductase   Ent-Kaurenoic Acid (AUC)
CPR-1                       14113
ATR-1                       13558
ATR-2                       29412
CPR-7                       18918
CPR-8                       12590
CPR-4                       25103
Control                     16593
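The AUC values in Table 14 are easier to compare when expressed relative to the control strain. The sketch below performs that normalization using only the values in the table; treating the AUC ratio as a proxy for relative ent-kaurenoic acid accumulation is an assumption, and the names used are illustrative.

```python
# Ent-kaurenoic acid peak areas (AUC) transcribed from Table 14.
TABLE_14_AUC = {
    "CPR-1": 14113,
    "ATR-1": 13558,
    "ATR-2": 29412,
    "CPR-7": 18918,
    "CPR-8": 12590,
    "CPR-4": 25103,
}
CONTROL_AUC = 16593  # control strain without additional CPR overexpression

# AUC of each CPR strain relative to the control strain.
relative_auc = {cpr: auc / CONTROL_AUC for cpr, auc in TABLE_14_AUC.items()}

# Rank the CPRs from highest to lowest relative accumulation.
ranked = sorted(relative_auc.items(), key=lambda item: item[1], reverse=True)
for cpr, ratio in ranked:
    print(f"{cpr}: {ratio:.2f} x control")
```

On this normalization ATR-2 and CPR-4 rank first and second, in line with the text above, while several of the other CPRs fall below the control.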

Example 13—Evaluating KS-5 and KS-1 in Steviol Pathways

The yeast strain EFSC1972 is a CEN.PK 111-61A S. cerevisiae strain that has the biosynthetic pathway from IPP/FPP to rubusoside expressed by integrated gene copies encoding the Aspergillus nidulans GGPPS (internal name GGPPS-10), the Stevia rebaudiana KS (KS1, SEQ ID NO:133), the Arabidopsis thaliana KAH (KAH-3, SEQ ID NO:144), the Stevia rebaudiana KO (KO1, SEQ ID NO:138), the Stevia rebaudiana CPR (CPR-1, SEQ ID NO:147), the full length Zea mays CDPS (CDPS-5, SEQ ID NO:158), the Stevia rebaudiana UGT74G1 (SEQ ID NO:1) and Stevia rebaudiana UGT85C2 (SEQ ID NO:3). Furthermore, EFSC1972 has down regulation of the ERG9 gene expression by displacement of the endogenous promoter with the copper inducible promoter CUP1.

When EFSC1972 is transformed with a CEN/ARS-based plasmid that expresses the Stevia rebaudiana SrKAHe1 from a TEF1 promoter, and simultaneously transformed with 2μ-based plasmids that express the Synechococcus sp. GGPPS (GGPPS-7) and a truncated version of the Zea mays CDPS (truncated CDPS-5) from a GPD promoter, the result is a growth-impaired S. cerevisiae producer of rubusoside (and 19-SMG). This strain is referred to as the “enhanced EFSC1972” in the following text. To determine whether the slow growth rate is caused by accumulation of the toxic pathway intermediate ent-copalyl diphosphate, a collection of kaurene synthase (KS) genes was expressed in the “enhanced EFSC1972” strain, and growth and steviol glycoside production were then assessed.

Expression of the A. thaliana KS (KS5) results in improved growth and steviol glycoside production of the “enhanced EFSC1972” strain. See FIG. 16. The same positive effect on growth cannot be achieved by further overexpression of the Stevia rebaudiana kaurene synthase (KS-1) in the enhanced EFSC1972 (data not shown).

Example 14—Yeast Strain EFSC1859

Saccharomyces cerevisiae strain EFSC1859 contains GGPPS-10, CDPS-5, KS-1, KO-1, KAH-3, CPR-1 and UGT74G1 coding sequences integrated into the genome and expressed from the strong constitutive GPD1 and TPI promoters. See Table 15. In addition, the endogenous promoter for the yeast ERG9 gene was replaced with the copper inducible promoter CUP1 for downregulation of the ERG9 squalene synthase. In standard yeast growth medium, the ERG9 gene is transcribed at very low levels, since the concentration of copper in such medium is low. The decrease in ergosterol production in this strain results in increased amounts of isoprene units available for isoprenoid biosynthesis. In addition, strain EFSC1859 also expresses UGT85C2 from a 2 micron multicopy vector using a GPD1 promoter. EFSC1859 produces rubusoside and steviol 19-O-glycoside.

Zea mays CDPS DNA, with and without the chloroplast signal peptide, was expressed from a 2 micron multicopy plasmid using the GPD promoter. The nucleotide sequence and amino acid sequence of the Zea mays CDPS are set forth in FIG. 14. The chloroplast signal peptide is encoded by nucleotides 1-150 and corresponds to residues 1 to 50 of the amino acid sequence.

TABLE 15

Gene Source             Enzyme             Designation        gi Number   Accession No.
Aspergillus nidulans    GGPP synthase      GGPPS-10 (C301)    29468175    AF479566
Zea mays                CDP synthase       CDPS-5 (EV65)      50082774    AY562490
Stevia rebaudiana       Kaurene synthase   KS-1               4959241     AAD34295
Stevia rebaudiana       KO                 KO-1               76446107    ABA42921
Arabidopsis thaliana    KAH                KAH-3              15238644    NP_197872
Stevia rebaudiana       UGT74G1
Stevia rebaudiana       UGT85C2
Stevia rebaudiana       CPR                CPR-1              93211213    ABB88839

EFSC1859+maize full-length CDPS plasmid, and EFSC1859+maize truncated CDPS plasmid were grown in selective yeast medium with 4% glucose. Rubusoside and 19-SMG production were measured by LC-MS to estimate the production level. The removal of the plastid leader sequence did not appear to increase steviol glycoside production as compared to the wild-type sequence, and demonstrates that the CDPS transit peptide can be removed without causing a loss of steviol glycoside biosynthesis.

Example 15—Yeast Strain EFSC1923

Saccharomyces cerevisiae strain CEN.PK 111-61A was modified to produce steviol glycosides by introduction of steviol glycoside pathway enzymes from various organisms. The modified strain was designated EFSC1923.

Strain EFSC1923 contains an Aspergillus nidulans GGPP synthase gene expression cassette in the S. cerevisiae PRP5-YBR238C intergenic region, a Zea mays full-length CDPS and Stevia rebaudiana CPR gene expression cassette in the MPT5-YGL176C intergenic region, a Stevia rebaudiana kaurene synthase and CDPS-1 gene expression cassette in the ECM3-YOR093C intergenic region, an Arabidopsis thaliana KAH and Stevia rebaudiana KO gene expression cassette in the KIN1-INO2 intergenic region, a Stevia rebaudiana UGT74G1 gene expression cassette in the MGA1-YGR250C intergenic region and a Stevia rebaudiana UGT85C2 gene expression cassette integrated by displacing the TRP1 gene ORF. See Table 15. In addition, the endogenous promoter for the yeast ERG9 gene was replaced with the copper inducible promoter CUP1.

Strain EFSC1923 produced approximately 5 μM of the steviol glycoside, steviol 19-O-monoside, on selective yeast medium with 4% glucose.

Example 16—Expression of a Truncated Maize CDPS in Yeast Strain EFSC1923

The 150 nucleotides at the 5′ end of the Zea mays CDP synthase coding sequence in Table 15 (SEQ ID NO:157, see FIG. 14) were deleted, the remainder of the coding sequence was provided with a new translation start ATG, and the truncated sequence was operably linked to the GPD1 promoter in the multicopy plasmid p423GPD in Saccharomyces cerevisiae EFSC1923. Plasmid p423GPD is described in Mumberg, D. et al., Gene, 156: 119-122 (1995). EFSC1923 and EFSC1923 plus p423GPD-Z.m.tCDPS were grown for 96 hours in selective yeast medium containing 4% glucose. The amount of steviol 19-O-monoside produced by EFSC1923+p423GPD-Z.m.tCDPS (the truncated Zea mays CDPS) under these conditions was approximately 2.5 fold more than that produced by EFSC1923 without the plasmid.
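The truncation described above (removal of the first 150 nucleotides and addition of a new ATG) can be sketched as a simple string operation on a coding sequence. This is a minimal illustration only: the function name is hypothetical, and the sequence used is a short made-up placeholder, not the Zea mays CDPS sequence of FIG. 14.

```python
def truncate_with_new_start(cds, n_removed=150):
    """Delete the first n_removed nucleotides of a coding sequence and prepend ATG.

    n_removed must be a multiple of 3 so that the remaining codons stay in frame.
    """
    cds = cds.upper()
    if n_removed % 3 != 0:
        raise ValueError("n_removed must be a multiple of 3 to keep the reading frame")
    if len(cds) <= n_removed:
        raise ValueError("coding sequence is not longer than the region to remove")
    return "ATG" + cds[n_removed:]


# Placeholder sequence (NOT the real CDPS): ATG + 49 GCT codons form a 150 nt leader,
# followed by a short in-frame remainder ending in a stop codon.
toy_leader = "ATG" + "GCT" * 49
toy_cds = toy_leader + "AAACCCGGGTAA"

truncated = truncate_with_new_start(toy_cds)
print(len(toy_leader), truncated)
```

Removing 150 nucleotides corresponds to removing 50 codons, which matches the 50-residue chloroplast signal peptide described for the Zea mays CDPS; the frame check in the sketch reflects that correspondence.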

The Arabidopsis thaliana KAH coding sequence from Table 15 was inserted in a multicopy plasmid designated p426GPD, under the control of the GPD1 promoter. Plasmid p426GPD is described in Mumberg, D. et al., Gene, 156: 119-122 (1995). No significant difference was observed between the amount of steviol 19-O-monoside produced by EFSC1923+p426GPD-A.t.KAH, and EFSC1923 lacking the plasmid.

EFSC1923 was transformed with both p423GPD-Z.m.tCDPS and p426GPD-A.t.KAH. Surprisingly, the amount of steviol 19-O-monoside produced under these conditions by EFSC1923 harboring both plasmids (i.e., the truncated Zea mays CDPS and Arabidopsis KAH) was more than 6 fold greater than the amount produced by EFSC1923 alone.

A bifunctional CDPS-KS from Gibberella fujikuroi (NCBI Accession No. Q9UVY5.1, FIG. 15) was cloned and compared to the truncated CDPS-5. The bifunctional Gibberella CDPS-KS was cloned into a 2μ plasmid with a GPD promoter and transformed into EFSC1923 together with a 2μ-based plasmid expressing the Arabidopsis thaliana KAH-3 from a GPD promoter. In shake flask studies, this bifunctional CDPS-KS was about 5.8 times more active in producing steviol 19-O-monoside than strain EFSC1923 with the KAH-3 alone. However, it was found to be less optimal than the KAH-3 and truncated CDPS combination under the conditions tested. Therefore, further strains were constructed with KS-5 and truncated CDPS.

Example 17—Toxicity of Intermediates

The effect on S. cerevisiae vitality of geranyl geranyl pyrophosphate (GGPP), ent-copalyl diphosphate (CDP) or ent-kaurene production was investigated by expression of Synechococcus sp. GGPPS alone (GGPP production), the GGPPS and the 50 amino acid N-terminally truncated Zea mays CDPS (see Example 16) together (CDP production), or the GGPPS, truncated CDPS and the Arabidopsis thaliana kaurene synthase (KS5) together (ent-kaurene production) in the laboratory S. cerevisiae strain CEN.PK background. Genes were expressed from 2μ plasmids with GPD promoters driving transcription of truncated CDPS and KS5, while transcription of the GGPPS was driven by the ADH1 promoter. The growth of S. cerevisiae CEN.PK transformed with various combinations of these plasmids (GGPPS alone; GGPPS+truncated CDPS; or GGPPS+truncated CDPS+KS5) or plasmids without gene insertions was observed. GGPP production, and especially CDP production, was toxic to S. cerevisiae when produced as end products. Interestingly, ent-kaurene did not appear to be toxic to yeast in the amounts produced in this experiment.

Example 18—Disruption of Endogenous Phosphatase Activity

The yeast genes DPP1 and LPP1 encode phosphatases that can degrade FPP and GGPP to farnesol and geranylgeraniol, respectively. The gene encoding DPP1 was deleted in strain EFSC1923 (described in Example 15) to determine if there was an effect on steviol glycoside production. When this dpp1 mutant strain was further transformed with a plasmid expressing the Z. mays CDPS lacking the chloroplast transit sequence (Example 16), both small and large transformants emerged. Strains of the “large colony” type produced ~40% more 19-SMG as compared to the “small colony” type and the non-DPP1 deleted strain, under the conditions tested. These results indicate that deletion of DPP1 can have a positive effect on steviol glycoside production and that the degradation of prenyl pyrophosphates in yeast therefore could influence steviol glycoside production negatively.

Example 19—Construction of a Genetically Stable Yeast Reporter Strain Producing Vanillin Glucoside from Glucose with a Disrupted SUC2 Gene

A yeast strain producing vanillin glucoside from glucose was created essentially as described in Brochado et al. ((2010) Microbial Cell Factories 9:84-98) (strain VG4), but with additional integration into the ECM3 inter-locus region in the yeast genome of an expression cassette with E. coli EntD PPTase controlled by the yeast TPI1 promoter (as described in Hansen et al. (2009) Appl. Environ. Microbiol. 75(9):2765-2774), disruption of SUC2 by replacing the coding sequence with a MET15 expression cassette, and disruption of LEU2 by replacing the coding sequence with a Tn5ble expression cassette conferring resistance to phleomycin. The resulting yeast strain was called V28. This strain also encodes a recombinant A. thaliana UDP-glycosyltransferase (UGT72E2, GenBank Accession No. Q9LVR1) having the amino acid sequence set forth in FIG. 19 (SEQ ID NO:178).

Example 20—Expression of Sucrose Transporter and Sucrose Synthase in Yeast Already Biosynthesizing Vanillin Glucoside

A sucrose transporter SUC1 from Arabidopsis thaliana was isolated by PCR amplification from cDNA prepared from A. thaliana, using proof-reading PCR polymerase. The resulting PCR fragment was transferred by restriction digestion with SpeI and EcoRI and inserted into the corresponding position in the low copy number yeast expression vector p416-TEF (a CEN-ARS based vector), from which the gene can be expressed from the strong TEF promoter. The resulting plasmid was named pVAN192. The sequence of the encoded sucrose transporter is set forth in FIG. 19B (GenBank Accession No. AEE35247, SEQ ID NO:179).

A sucrose synthase SUS1 from Coffea arabica (Accession No. CAJ32596) was isolated by PCR amplification from cDNA prepared from C. arabica, using proof-reading PCR polymerase. The PCR fragment was transferred by restriction digestion with SpeI and SalI and inserted into the corresponding position in the high copy number yeast expression vector p425-GPD (a 2 μm based vector), from which the gene can be expressed from the strong GPD promoter. The resulting plasmid was named pMUS55. The sequence of the encoded sucrose synthase is set forth in FIG. 19C (GenBank Accession No. CAJ32596; SEQ ID NO:180).

pVAN192 and pMUS55 were introduced into the yeast strain V28 by genetic transformation, using a lithium acetate transformation protocol, creating the yeast strain V28::pVAN192::pMUS55. A control strain was made by transforming V28 with the empty plasmids p416-TEF and p425-GPD.

These two yeast strains were grown in 200 ml cultures in 500 ml Erlenmeyer shake flasks using SC (synthetic complete) growth medium without aromatic amino acids, supplemented with 2% glucose and 2% sucrose and adjusted to pH 5.0. Cultures were incubated at moderate revolution (150 rpm), at 30° C. for 72 hours. Samples were taken at 72 hours, and the content of vanillin glucoside (VG) was determined. As can be seen from the table below, VG production in the control strain (containing empty plasmids p416-TEF and p425-GPD) was 330 mg/L VG, while the yeast strain V28::pVAN192::pMUS55 expressing sucrose synthase and sucrose transporter produced 445 mg/L VG, corresponding to a 34.8% increase in VG production.

Strain                        Vanillin glucoside (mg/L after 72 h)
V28 (p416-TEF + p425-GPD)     330
V28::pVAN192::pMUS55          445

This indicates that co-expression of a sucrose synthase and a sucrose transporter together with a glucosyltransferase increased the ability to glycosylate a small molecule aglycon, and the concentration of the glycosylated aglycon was significantly increased. In this case, a significant improvement in vanillin glucosylation was achieved, resulting in a significant increase in titer of the end product, vanillin-O-β-glucoside.
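The increase reported for Example 20 can be re-derived from the two titers in the table. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and not part of the original disclosure.

```python
def percent_increase(control: float, treated: float) -> float:
    """Relative increase of `treated` over `control`, in percent."""
    if control <= 0:
        raise ValueError("control titer must be positive")
    return (treated - control) / control * 100.0


# Titers taken from the text of Example 20 (mg/L vanillin glucoside at 72 h).
control_vg = 330.0  # V28 carrying the empty plasmids
treated_vg = 445.0  # V28::pVAN192::pMUS55

increase = percent_increase(control_vg, treated_vg)
print(f"VG increase: {increase:.1f}%")
```

With the stated values this reproduces the 34.8% figure given in the text.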

Example 21—Improved Steviol Glycoside Producing Strains

Strain Construction of Saccharomyces cerevisiae EFSC2763

The EFSC2763 yeast strain is derived from a wild type Saccharomyces cerevisiae strain containing three auxotrophic modifications, namely the deletions of URA3, LEU2 and HIS3. The genetics of the strain have been stabilized and it can be used as a regular diploid or haploid yeast strain. EFSC2763 has been converted to a steviol glycoside producing yeast by genomic integration of four DNA constructs. Each construct contains multiple genes that were introduced into the yeast genome by homologous recombination. Furthermore, constructs one and two were assembled by homologous recombination.

The first construct contains eight genes and is inserted in the DPP1 locus and disrupts and partially deletes DPP1 (see Example 18). The DNA inserted contains: the A. gossypii TEF promoter expressing the NatMX gene (selectable marker) followed by the TEF terminator from A. gossypii; GeneArt codon-optimized S. rebaudiana UGT85C2 (see Example 10) expressed from the native yeast GPD1 promoter and followed by the native yeast CYC1 terminator; S. rebaudiana CPR-8 (see FIG. 13) expressed using the TPI1 promoter followed by the native yeast TDH1 terminator; A. thaliana Kaurene synthase (KS-5, see Example 13, SEQ ID NO:156) expressed from the PDC1 promoter and followed by the native yeast FBA1 terminator; Synechococcus sp. GGPPS (GGPPS-7) expressed using the TEF2 promoter and followed by the native yeast PFI1 terminator; DNA2.0 codon-optimized S. rebaudiana KAHe1 (see Example 11, SEQ ID NO:165), expressed from the TEF1 promoter and followed by the ENO2 terminator; S. rebaudiana KO-1 expressed using the FBA1 promoter and followed by the native yeast TDH2 terminator; and Zea mays truncated CDPS (see Example 14) expressed using the PGK1 promoter and followed by the native yeast ADH2 terminator.

The second construct was inserted at the YPRCΔ15 locus and contains the TEF promoter from A. gossypii expressing the KanMX gene (selectable marker) followed by the TEF terminator from A. gossypii, the GeneArt codon-optimized A. thaliana ATR2 (see FIG. 13B) expressed from the PGK1 promoter followed by the yeast ADH2 terminator, S. rebaudiana UGT74G1 expressed from the TPI1 promoter followed by the yeast TDH1 terminator, GeneArt codon-optimized S. rebaudiana UGT76G1 expressed from the TEF1 promoter followed by the yeast ENO2 terminator, and GeneArt codon-optimized S. rebaudiana UGT91D2e-b (see Example 6) expressed from the GPD1 promoter and followed by the yeast CYC1 terminator.

The first and the second construct were combined in the same spore clone by mating and dissection. This yeast strain was subsequently transformed with constructs three and four in two successive events.

Construct three was integrated between genes PRP5 and YBR238C and contained the TEF promoter from A. gossypii expressing the K. lactis LEU2 gene followed by the TEF terminator from A. gossypii, the GPD1 promoter expressing the DNA2.0-optimized S. rebaudiana KAHe1 followed by the CYC1 terminator, and the TPI1 promoter expressing the Zea mays truncated CDPS. Construct four was integrated in the genome between genes ECM3 and YOR093C with an expression cassette containing the TEF promoter from A. gossypii expressing the K. pneumoniae hph gene followed by the TEF terminator from A. gossypii, Synechococcus sp. GGPPS expressed from the GPD1 promoter followed by the CYC1 terminator, and the TPI1 promoter expressing the A. thaliana Kaurene synthase. The four utilized genetic markers were subsequently removed.

As analyzed by LC-MS following the DMSO-extraction of total steviol glycosides from cells and broth, EFSC2763 produces between 40-50 μM or 2-3 μM/OD600 Rebaudioside A, after growth for four days in 3 ml SC (Synthetic Complete) media at 30° C. with 320 RPM shaking in deep-well plates.

Strain Construction of Saccharomyces cerevisiae EFSC2772

EFSC2772 is very similar to strain 2763 with the exception that the genetic markers were not removed, and the strain was made prototrophic by introduction of the two plasmids p413TEF (public domain CEN/ARS shuttle plasmid with HIS3 marker) and p416-TEF (public domain CEN/ARS shuttle plasmid with URA3 marker) by transformation, and designated EFSC2772.

As analyzed by LC-MS following the DMSO-extraction of total steviol glycosides from cells and broth, EFSC2772 produces similar levels of Rebaudioside A as 2763, after growth in deep-well plates. Higher optical densities and higher titers were obtained through aerobic fed-batch growth in 2 L (working volume) fermentors, which included a ~16 hour growth phase in the base medium (Synthetic Complete media) followed by ~100 hours of feeding with glucose utilized as the carbon and energy source combined with trace metals, vitamins, salts, and Yeast Nitrogen Base (YNB) and/or amino acid supplementation. The pH was kept near pH 5 and the temperature setpoint was 30° C. As evidenced by LC-MS, combined cellular and extracellular product concentrations were between 920-1660 mg/L of Reb-A and approximately 300-320 mg/L of Reb-D in the two different experiments. Approximately 700 mg/L of Reb-A was detected in the broth when the higher titer results were obtained. Additionally, a large peak was seen for Reb-B, and one skilled in the art will recognize that additional copies of UGT74G1 or upregulation of UGT74G1 will further increase the conversion of RebB to RebA.

Strain EFSC2743 was made in a similar manner as above, but without the two plasmids conferring prototrophy and with the addition of a p416 (CEN/ARS)-based plasmid expressing EUGT11 from the TEF promoter. This strain was grown in a fed-batch fermentation as above. This strain produced a total amount of RebD of 920 mg/L, and furthermore approximately a 9:1 ratio of RebD to RebA was seen. Approximately 360 mg/L of RebD was found in the broth.
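To compare the mass titers reported in this example with molar quantities of glycosylated steviol, the values can be converted using molecular weights. The sketch below is illustrative only: the molecular weights (about 967.01 g/mol for RebA and 1129.15 g/mol for RebD) are assumptions taken from the standard molecular formulas and are not stated in this disclosure, and the function name is hypothetical. Because each glycoside molecule contains one steviol backbone, the millimolar glycoside concentration equals the millimolar amount of steviol glycosylated.

```python
# Assumed molecular weights in g/mol (not given in the text).
MOLECULAR_WEIGHT = {"RebA": 967.01, "RebD": 1129.15}


def mg_per_l_to_mm(titer_mg_per_l: float, compound: str) -> float:
    """Convert a titer in mg/L to mmol/L (mM) for the named compound."""
    return titer_mg_per_l / MOLECULAR_WEIGHT[compound]


# Titers quoted in the text for the fed-batch fermentations.
reba_low = mg_per_l_to_mm(920.0, "RebA")
reba_high = mg_per_l_to_mm(1660.0, "RebA")
rebd_efsc2743 = mg_per_l_to_mm(920.0, "RebD")

print(f"RebA, EFSC2772: {reba_low:.2f} to {reba_high:.2f} mM")
print(f"RebD, EFSC2743: {rebd_efsc2743:.2f} mM")
```

Under these assumed molecular weights, the RebA range corresponds to roughly 1 to 1.7 mM.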

Example 22—UDP-Glucose Capacity

In Example 21, it was shown that yeast can fully glycosylate over 1 mM steviol, e.g., to RebD, RebB, and RebA. Similarly, Saccharomyces strains are able to glycosylate as much as 60 mM of other small molecule products (data not shown). However, the glycosylation limit of the yeast native UDP-glucose regenerating system is unknown, as is the rate at which it replenishes the UDP-glucose pool needed for cell wall synthesis. Therefore, experiments were designed to investigate if an increase in UDP-glucose production would increase the glycosylation rate in yeast. A suc2 deletion mutant was transformed with plasmids harboring the A. thaliana suc1 gene encoding a sucrose transporter, UGT74G1 and A. thaliana SUS. UGT74G1 can rapidly glycosylate steviol to steviol 19-O-monoglucoside (19-SMG). Transformants were pre-grown overnight in 13-ml culture tubes containing 4 ml of SC medium lacking leucine, histidine and uracil. The next day, cells corresponding to 2 OD₆₀₀ units were spun down and resuspended in 2 ml of fresh medium containing 2% sucrose and/or 100 μM steviol. Cultures were shaken at 30° C. for 3 days in culture tubes. Aliquots were taken after 1 h, 3 h, 6 h, 21 h and 46 h. Aliquots of 100 μl of culture were spun down and an equal volume of DMSO was added. Samples were vortexed, heated at 80° C. for 15 minutes, centrifuged, and the 19-SMG content analyzed by LC-MS. No difference in the rate of glycosylation of steviol was observed between wild-type and SUS1-augmented strains at the time points tested. This suggests that glycosylation of steviol by UGT74G1 proceeds at a slower rate than UDP-glucose is regenerated by the yeast and that extra UDP-glucose may not be needed to achieve high titers of small molecule glycosylation in vivo. Nevertheless, the use of a SUS to recycle UDP-glucose in vitro is shown in Example 8, and therefore its use in an in vivo system is expected to increase the rate of production of steviol glycosides if UDP-glucose should become limiting.

Example 23—Reb-C and Reb-F Production In Vivo from Glucose

Production of RebC from steviol

Previous experiments (Publication No. WO/2011/153378) have shown that recombinantly expressed Arabidopsis thaliana RHM2 (rhamnose synthetase, locus tag AT1G53500) is able to convert UDP-glucose to UDP-rhamnose. This UDP-rhamnose can be used to produce steviol-13-O-glucopyranosyl-1,2-rhamnoside, when incubated with UGT91D2e and steviol-13-O-monoglucoside in vitro.

Further experiments were conducted to confirm production of RebC from steviol by expressing all 4 UGTs and the RHM2 in yeast in vivo, followed by steviol feeding. EFSC301 strain (MAT alpha, lys2 ADE8 his3 ura3 leu2 trp1) was transformed with the following plasmids expressing wild type gene sequences: p424GPD expressing wild type UGT74G1 (Accession no: AY345982); p423GPD expressing wild type 85C2 (Accession no.: AY345978.1); and a p426GPD-derived plasmid expressing wild type UGT76G1 (Accession no: AY345974) and UGT91D2e under GPD promoters. Either plasmid p425GPD expressing RHM2 or an empty p425GPD control plasmid was cotransformed with the UGTs. Transformants were pre-grown overnight in 13 mL culture tubes containing 2-3 ml of SC medium lacking leucine, histidine, tryptophan and uracil. The next day, after growth had reached 0.4 OD₆₀₀ units, cells were spun down, resuspended in fresh medium containing 25 μM steviol and shaken at 30° C. for 3 days in culture tubes. An aliquot of 100 μL of culture was spun down. An equal volume of DMSO was added to the supernatant of this sample while 200 μL of 50% DMSO was added to the pellet. Samples were vortexed, heated at 80° C. for 15 minutes, centrifuged, and the steviol glycoside content analyzed by LC-MS. RebC was detected in growth media and cellular extracts only when the RHM2 gene was coexpressed with the UGTs. Quantification showed that approximately equal amounts of RebA and RebC were produced. This shows that RHM2 is able to produce significant quantities of UDP-rhamnose in vivo and that UGT91D2e is capable of efficient rhamnosylation in vivo. Two other compounds were observed via LC-MS with retention times of 5.64 and 5.33 minutes and m/z ratios corresponding to steviol with 1 glucose and 1 rhamnose (steviol-1,2-rhamnobioside), and 2 glucoses and 1 rhamnose (Dulcoside A), respectively. This suggests that the remaining UGTs in the steviol glycoside pathway are capable of accepting rhamnosylated intermediates, i.e., the rhamnosylation step does not need to occur last.

In addition, a series of sequential in vitro experiments were conducted to determine whether any dead-end reactions occur in the rebaudioside C pathway. See FIG. 2B. For example, the rhamnosylation activity of UGT91D2e on rubusoside and subsequent conversion of the product to RebC by UGT76G1 was demonstrated using in vitro reactions. In this experiment, UGT91D2e and RHM2 recombinantly expressed in E. coli and purified were incubated overnight with rubusoside, NADPH, NAD⁺ and UDP-glucose. The reaction mixture was subsequently boiled to denature the enzymes. An aliquot of the reaction was added to an enzyme preparation of UGT76G1 with UDP-glucose. The rubusoside was converted in the presence of UGT91D2e and RHM2 to a compound with m/z corresponding to steviol with 2 glucoses and 1 rhamnose. Subsequently, this compound was converted in the presence of UGT76G1 to RebC, which indicates that the intermediate is Dulcoside A. This experiment therefore demonstrates that UGT91D2e is able to rhamnosylate rubusoside and that UGT76G1 is able to convert the product to RebC.

Similarly, in vitro reactions showed rhamnosylation of 13-SMG by UGT91D2e (forming a steviol compound with one glucose and one rhamnose) and subsequent formation of a compound with 2 glucoses and 1 rhamnose by UGT76G1. This compound has a unique retention time (4.56 min) and is thought to be steviol 13-O-1,3-diglycoside-1,2-rhamnoside. This compound also was observed when steviol was fed to yeast expressing the four UGTs and RHM2.

From the current data, it is shown that UGT91D2e is able to rhamnosylate 13-SMG and rubusoside. It is also shown that UGT74G1 and UGT76G1 are able to metabolize the rhamnosylated compound produced by UGT91D2e from 13-SMG. When these compounds are incubated with the remaining UGT (UGT74G1 or UGT76G1 depending on which UGT was used for the previous step), RebC is formed. This indicates that the order of glycosylation is of little importance as UGT74G1 and UGT76G1 are able to glycosylate rhamnosylated substrates.

Production of RebC from Glucose

Plasmids expressing RHM2 and UGTs 76G1 and 91D2e were transformed into a stable rubusoside producer, the EFSC1923 strain (see Example 15). This yeast is a Saccharomyces cerevisiae CEN.PK 111-61A derivative with the UGTs 85C2 (Accession no.: AY345978.1) and 74G1 (Accession no: AY345982) integrated into the genome as well as auxotrophic modifications. In strain EFSC1923 (see Example 15), expression of squalene synthase, which is encoded by ERG9, was downregulated by displacement of the endogenous promoter with the CUP1 copper-inducible promoter. Strain EFSC1923 also contains an Aspergillus nidulans GGPP synthase (GGPPS-10) expression cassette in the S. cerevisiae PRP5-YBR238C intergenic region, a Zea mays full-length CDPS (CDPS-5) and Stevia rebaudiana CPR (CPR-1) gene expression cassette in the MPT5-YGL176C intergenic region, a Stevia rebaudiana Kaurene synthase and CDPS (KS-1/CDPS-1) gene expression cassette in the ECM3-YOR093C intergenic region, an Arabidopsis thaliana KAH (KAH-3) and Stevia rebaudiana KO (KO-1) gene expression cassette in the KIN1-INO2 intergenic region, a Stevia rebaudiana UGT74G1 gene expression cassette in the MGA1-YGR250C intergenic region and a Stevia rebaudiana UGT85C2 gene expression cassette integrated by displacing the TRP1 gene ORF20. Inserted steviol pathway genes are described in Table 11 of published PCT WO/2011/153378.

EFSC1923 strain was transformed with a p423GPD-derived plasmid expressing wild type UGT74G1 and UGT85C2 sequences using GPD promoters and a p426GPD-derived plasmid expressing wild type UGT76G1 (Accession no: AY345974) and UGT91D2e (see SEQ ID NO:5) under the control of GPD promoters. Plasmid p425GPD expressing Arabidopsis thaliana RHM2 (enzyme locus tag AT1G53500) or an empty p425GPD control plasmid was co-transformed. Transformants were pre-grown overnight in 13-ml culture tubes containing 2-3 ml of SC medium lacking leucine, histidine and uracil. The next day, when the culture reached an OD₆₀₀ of 0.4 units it was centrifuged, resuspended in fresh medium, and shaken at 30° C. for 3 days in culture tubes. One hundred μL of culture was spun down; an equal volume of DMSO was added to the supernatant while 200 μL of 50% DMSO was added to the pellet. Samples were vortexed, heated at 80° C. for 15 minutes, spun down and the steviol glycoside content analyzed by LC-MS.

Analyses of the medium and normalized intracellular content of this strain showed production of RebC. Approximately 8 μM RebC and 4 μM RebA were produced as determined by LC-MS. Furthermore, the intermediates produced following steviol feeding were not detected in this experiment. Accumulation of RebC was strictly dependent on expression of RHM2. This example demonstrates de novo biosynthesis of RebC from glucose.

Production of Additional Steviol Glycosides from Steviol and Glucose

Using the same GPD-based plasmids described above, the stable steviol-producing strain EFSC1923 containing UGT74G1 and UGT85C2 was transformed with the UGTs required to produce RebB (UGT76G1 and UGT91D2e/EUGT11), RebE (UGT91D2e/EUGT11) and dulcoside A (RHM2, UGT91D2e/EUGT11). Wild type EUGT11 (NCBI: NP_001051007), which was found to have higher diglycosylation activity, was cloned into p424GPD for this experiment. Transformants were pre-grown overnight in 13-ml culture tubes containing 2-3 ml of SC medium lacking leucine, histidine, tryptophan and uracil. The next day, after growth had reached 0.4 OD₆₀₀ units, cells were spun down, resuspended in fresh medium containing 25 μM steviol (except for glucose experiments) and shaken at 30° C. for 3 days in culture tubes. An aliquot of 100 μL of culture was spun down. An equal volume of DMSO was added to the supernatant of this sample while 200 μL of 50% DMSO was added to the pellet. Samples were vortexed, heated at 80° C. for 15 minutes, centrifuged, and the steviol glycoside content analyzed by LC-MS. LC-MS analyses confirmed in vivo production of RebB, RebE, and Dulcoside A in S. cerevisiae from glucose or steviol. See, e.g., FIGS. 2A and 2B. A higher concentration of steviol glycosides was observed following steviol feeding (as judged by chromatograms).

Characterization of RebF Pathway Intermediates Using EUGT11

The xylosylating properties of UGT91D2e and EUGT11 were compared in vitro. By using UDP-xylose as the sugar donor, UGT91D2e was previously shown to xylosylate steviol-13-O-monoglucoside, forming a key intermediate in RebF biosynthesis (Publication No. WO/2011/153378). Similar in vitro experiments using EUGT11 and UGT91D2e have shown that these UGTs are capable of xylosylating rubusoside. When UGT91D2e is used, the LC-MS analysis shows a new peak with an m/z ratio corresponding to steviol with 2 glucose molecules and 1 xylose. See FIG. 26. Because of the shift in the retention time, this peak is thought to correspond to rubusoside xylosylated on the 13-O-glucose. When EUGT11 is used, the LC-MS analysis shows two new, similar sized peaks at retention times 3.99 and 4.39 minutes with m/z ratios corresponding to steviol with 2 glucoses and 1 xylose. These products most likely correspond to rubusoside xylosylated on either of the two positions: the 13-O-glucose or the 19-O-glucose.

Production of RebF from Glucose

In vivo production of RebF requires cloning of UGD1 (UDP-glucose dehydrogenase) and UXS3 (UDP-glucuronic acid decarboxylase) from Arabidopsis for production of UDP-xylose. UGD1 and UXS3 were inserted in a high copy (2μ) vector, derived from p425-GPD, containing two expression cassettes, and expressed from strong constitutive promoters (TPI1 and GPD1, respectively). The plasmid was transformed into the RebA producer strain EFSC2763 (described in Example 21) and cultivated for 3 days in selection medium (SC-leu). The LC-MS results clearly show the appearance of a new peak at retention time 4.13 minutes with m/z ratios corresponding to steviol with 3 glucoses and 1 xylose and identified as RebF (based on a commercial RebF standard), as well as other new peaks with m/z ratios corresponding to steviol with 2 glucoses and 1 xylose (as above), indicating that UGT91D2e was capable of carrying out xylosylation in vivo. These peaks were not seen in the negative controls.

Example 24—Effect of Squalene Synthase (ERG9) Down Regulation Using a Heterologous Insert

In yeast such as Saccharomyces cerevisiae, the mevalonate pathway produces a number of isoprenoid phosphate intermediates in the biosynthetic pathway to squalene (See FIG. 20). The squalene synthase in yeast is ERG9. See GenBank Accession No. P29704.2 for the Saccharomyces cerevisiae squalene synthase; P36596 for the Schizosaccharomyces pombe squalene synthase; Q9Y753 for the Yarrowia lipolytica squalene synthase; Q9HGZ6 for the Candida glabrata squalene synthase; Q752X9 for the Ashbya gossypii squalene synthase; O74165 for the Cyberlindnera jadinii squalene synthase; P78589 for the Candida albicans squalene synthase; P38604 for the Saccharomyces cerevisiae lanosterol synthase; P37268 for the Homo sapiens squalene synthase; P53798 for the Mus musculus squalene synthase; and Q02769 for the Rattus norvegicus squalene synthase. See FIG. 25 (SEQ ID NOs:192-202).

Introduction of Stem Loop Structure in 5′UTR of ERG9 Gene

The wild-type ERG9 promoter region was replaced with the CYC1 promoter sequence and a 5′UTR sequence by homologous recombination. The 5′UTR region contains a sequence that can form a stemloop structure. See SEQ ID NOs:181-183. SEQ ID NO:184 is another sequence that also can be used.

SEQ ID NO: 181 (heterologous insert 1): TGAATTCGTTAACGAATTC
SEQ ID NO: 182 (heterologous insert 2): TGAATTCGTTAACGAACTC
SEQ ID NO: 183 (heterologous insert 3): TGAATTCGTTAACGAAGTC
SEQ ID NO: 184 (heterologous insert 4): TGAATTCGTTAACGAAATT

Without being bound to a particular mechanism, the stemloop may partially block the 5′-3′ directed ribosomal scanning for the AUG and reduce the translation of the transcript. Stemloops with different degrees of basepairing were tested to find stemloops that reduced the ERG9 transcript translation sufficiently to boost FPP levels without affecting the growth of the yeast strain.
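The differing degrees of basepairing among the four heterologous inserts can be illustrated by counting complementary positions between the two arms of each sequence. The following sketch makes several assumptions that are not stated in the disclosure: it treats each insert as a simple hairpin with a 7-nucleotide arm starting at the second base, a TTAA loop, and a 7-nucleotide 3′ arm; it counts only Watson-Crick pairs on the DNA sequence (G-U wobble pairs in the transcript are ignored); and it does not estimate folding energy. The function name and parameters are hypothetical.

```python
# Heterologous inserts as listed for SEQ ID NOs:181-184 (DNA form).
INSERTS = {
    "SEQ ID NO:181": "TGAATTCGTTAACGAATTC",
    "SEQ ID NO:182": "TGAATTCGTTAACGAACTC",
    "SEQ ID NO:183": "TGAATTCGTTAACGAAGTC",
    "SEQ ID NO:184": "TGAATTCGTTAACGAAATT",
}

WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def stem_pairs(seq: str, arm: int = 7, offset: int = 1) -> int:
    """Count Watson-Crick pairs between the 5' arm and the 3' arm.

    Assumes a simple hairpin: the 5' arm is seq[offset:offset + arm] and
    the 3' arm is the last `arm` bases, read in reverse (antiparallel).
    """
    five_prime = seq[offset:offset + arm]
    three_prime = seq[-arm:][::-1]
    return sum((a, b) in WATSON_CRICK for a, b in zip(five_prime, three_prime))


pairs = {name: stem_pairs(seq) for name, seq in INSERTS.items()}
for name, count in pairs.items():
    print(f"{name}: {count} of 7 stem positions paired")
```

Under these assumptions only SEQ ID NO:181 pairs at all seven stem positions, while the other inserts carry one or two mismatches in the 3′ arm.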

DNA fragments encompassing an ERG9 promoter upstream sequence (for homologous recombination), an expression cassette for the gene (NatR) that confers resistance to nourseothricin, a CYC1 promoter (SEQ ID NO:185, FIG. 21), a 5′UTR sequence with a stemloop structure, and an ERG9 ORF sequence (for homologous recombination) were generated by PCR. DNA fragments that contained either the CYC1 promoter or the KEX2 promoter (SEQ ID NO:186) but no stemloops were also generated as controls. The flanking ERG9 sequences for recombination as well as the stemloop structure were introduced via the PCR oligos. An overview of the construct for homologous recombination is shown in FIG. 22. The DNA fragments were transformed into an S. cerevisiae host strain that subsequently was selected on nourseothricin-containing growth plates. Clones with successful exchange of the native ERG9 promoter with the CYC1 promoter and stemloop-containing 5′UTR sequence were identified. An overview and the sequence of the stemloop region are provided in FIG. 23. The sequence identified as 5% corresponds with the heterologous insert having SEQ ID NO:181; the sequence identified as 20% corresponds with the heterologous insert having SEQ ID NO:182; and the sequence identified as 50% corresponds with the heterologous insert having SEQ ID NO:183.

Assessment of FPP Accumulation (Boosting Effect)

The Amorpha-4,11-diene Synthase (ADS) gene encodes an enzyme that catalyzes the chemical reaction that turns one FPP molecule into Amorpha-4,11-diene in the plant Artemisia annua. The gene is functional and efficient in S. cerevisiae and can be used to indirectly assess the accumulation of FPP in the strains with the stemloop structure introduced in the heterologous 5′UTR of the ERG9 gene. An S. cerevisiae codon-optimized nucleic acid encoding ADS (GenBank Accession No. AAF61439) was cloned on a multicopy plasmid (2μ) under the control of the PGK1 promoter and transformed in the wild type and engineered S. cerevisiae strains. Amorpha-4,11-diene production was measured and compared to the standard compound caryophyllene, as described by Ro et al. 2006. Nature 440(7086):940-943; Paradise et al. Biotechnol Bioeng. 2008 Jun. 1; 100(2):371-8; Newman et al. Biotechnol Bioeng 95(4):684-691.

Chemicals

Dodecane and caryophyllene were purchased from Sigma-Aldrich (St. Louis, Mo.). Complete Supplement Mixtures for formulation of Synthetic Complete (SC) media were purchased from Formedium (UK). All other chemicals were purchased from Sigma-Aldrich.

Yeast Cultivation

Engineered yeast strains were grown in SC 2% glucose with uracil dropped out. Cultures were grown at 30° C. overnight and then used to inoculate main cultures in 250 mL shake flasks containing 25 mL SC medium, and grown to an optical density of 0.1 at 600 nm. The main cultures were grown for 72 h at 30° C. Because amorphadiene at very low concentrations is volatile from aqueous cultures, 2.5 mL dodecane was added to each culture flask in order to trap and retain the amorphadiene produced. 10 μL of the dodecane layer was sampled and diluted 100 fold in ethyl acetate for quantification by GC-MS.

GC-MS Analysis of Amorphadiene

GC-MS was used to measure amorphadiene production from yeast cultures. Samples were analysed using the following method: the GC oven temperature program used 80° C. for 2 min, followed by a ramp of 30° C./min to 160° C., then 3° C./min up to 170° C., and finally 30° C./min up to 300° C. with a 2 min final hold. Injector and MS quadrupole detector temperatures were 250° C. and 150° C., respectively. 1 μL was injected in splitless mode. The MS was operated in full scan mode. Amorphadiene concentration was calculated in (-)-trans-caryophyllene equivalents using a caryophyllene standard curve based on the total ions.
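The oven program described above implies a fixed run length per injection, which can be derived from the hold times and ramp rates. The sketch below is a simple illustration of that arithmetic; it assumes ideal linear ramps with no equilibration or cool-down time, and the function name and argument layout are hypothetical.

```python
def run_time_minutes(start_temp, initial_hold, ramps, final_hold):
    """Total oven program time in minutes.

    `ramps` is a list of (rate in deg C/min, target temperature in deg C)
    tuples applied in order, starting from `start_temp`.
    """
    total = initial_hold
    temp = start_temp
    for rate, target in ramps:
        total += (target - temp) / rate  # time spent on this ramp
        temp = target
    return total + final_hold


# Program from the text: 80 C for 2 min; 30 C/min to 160 C;
# 3 C/min to 170 C; 30 C/min to 300 C; 2 min final hold.
total = run_time_minutes(
    start_temp=80.0,
    initial_hold=2.0,
    ramps=[(30.0, 160.0), (3.0, 170.0), (30.0, 300.0)],
    final_hold=2.0,
)
print(f"Oven program length: {total:.2f} min")
```

Under these assumptions the program runs for a little over 14 minutes per sample, with the slow 3° C./min segment covering the 160-170° C. window.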

The analysis of the different strains, including the different promoter constructs, showed an increased production of amorphadiene (2.5×) when using the heterologous insert having the nucleotide sequence set forth in SEQ ID NO:181 compared to either no insert or the inserts having the nucleotide sequences set forth in SEQ ID NO:182 and 183. See FIG. 24. The heterologous insert set forth in SEQ ID NO:181 has the most stable secondary structure. For comparison, the wild type yeast, with unmodified ERG9, was also analyzed (FIG. 24: CTRL-ADS) and this strain showed even lower production of amorphadiene. Conversely, the construct that comprised the very weak promoter ScKex2 showed an even higher level of amorphadiene (6×).

Example 25—Analysis of the Effect of Squalene Synthase (ERG9) Down Regulation and GGPPS Overexpression on GGPP Production

Assessment of GGPP Accumulation

S. cerevisiae contains a GGPPS (BTS1). In addition to BTS1 there are several heterologous GGPPS enzymes that are functional and efficient in S. cerevisiae. When a functional GGPPS is overexpressed in S. cerevisiae, it leads to accumulation of GGPP, which may be converted to geranylgeraniol (GGOH) by the S. cerevisiae enzymes DPP1 and LPP1. The GGOH is partly exported to the yeast culture medium. GGOH can be measured by GC-MS and its accumulation can indirectly be used to assess the potential pool of GGPP that is available for enzymes that use GGPP as substrate.

Four different GGPPSs (GGPPS-1 (S. acidocaldarius, see Table 7), GGPPS-2 (A. nidulans, FIG. 25, SEQ ID NO:203), GGPPS-3 (S. cerevisiae, BTS1, FIG. 25, SEQ ID NO:167), and GGPPS-4 (M. musculus, see Table 7)) were assessed. The nucleotide sequences encoding GGPPS-1, GGPPS-2, and GGPPS-4 were S. cerevisiae codon optimized. All nucleic acids encoding the GGPPS polypeptides were cloned on a multicopy plasmid (2μ) under the control of the PGK1 promoter and transformed in two different ERG9 down regulated strains: KEX2-ERG9 and CYC1(5%)-ERG9 (see Example 24).

Engineered yeast strains were grown in SC 2% glucose with uracil dropped out. Complete Supplement Mixtures for formulation of Synthetic Complete (SC) media were purchased from Formedium (UK). All other chemicals were purchased from Sigma-Aldrich (St. Louis, Mo.). All optical density measurements were done at 600 nm. Cultures were grown at 30° C. overnight and then used to inoculate 250 ml unbaffled culture flasks containing 25 ml SC medium at an OD600 of 0.1. The main cultures were grown for 72 h at 30° C.

To measure GGOH accumulation, yeast cells (pellet) and yeast culture medium (supernatant) were extracted separately and then combined before analysis by GC-MS. The supernatant was extracted with hexane in a 1:1 ratio. The pellet was first subjected to saponification in a solution containing 20% KOH and 50% ethanol, and the lysed cells were finally extracted with hexane in a 1:1 ratio. The GC oven temperature program used was 80° C. for 2 min, followed by a ramp to 160° C. at 30° C./min, then to 170° C. at 3° C./min and finally to 300° C. at 30° C./min with a 2 min hold. Injector and MS quadrupole detector temperatures were 250° C. and 150° C., respectively. 2 μL was injected in splitless mode. The MS was operated in full scan mode.

When the GGPPSs were overexpressed in the CYC1(5%)-ERG9 strain or KEX2-ERG9 strain, there was a significant increase in GGOH (GGPP) production observed with all four GGPPS polypeptides compared to the control where no GGPPS was expressed. Notably, the CYC1(5%)-ERG9 strain showed a 2-4 fold higher GGOH (GGPP) accumulation than the KEX2-ERG9 strain. The results are shown in FIG. 26.

Other Embodiments

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

1. A method for producing a target steviol glycoside or a target steviol glycoside composition, comprising contacting a starting composition comprising a steviol, a precursor steviol glycoside having a 13-O-glucose, a 19-O-glucose, or both a 13-O-glucose and a 19-O-glucose, and/or a mixture thereof with a first uridine 5′-diphospho (UDP) glycosyl transferase polypeptide capable of beta 1,2 glycosylation of a C2′ of the 13-O-glucose, the 19-O-glucose, or both the 13-O-glucose and the 19-O-glucose of the precursor steviol glycoside and one or more UDP-sugars, under suitable reaction conditions to transfer one or more sugar moieties from the one or more UDP-sugars to the steviol, the precursor steviol glycoside, and/or the mixture thereof, thereby producing the target steviol glycoside or the target steviol glycoside composition; wherein the first 5′-UDP glycosyl transferase polypeptide is capable of converting Rebaudioside A (RebA) to Rebaudioside D (RebD) at a rate that is at least 20 times faster than the rate at which a UDP glycosyl transferase polypeptide having the amino acid sequence set forth in SEQ ID NO:5 is capable of converting RebA to RebD under corresponding reaction conditions; and/or wherein the first 5′-UDP glycosyl transferase polypeptide is capable of converting higher amounts of RebA to RebD compared to the UDP glycosyl transferase polypeptide having the amino acid sequence set forth in SEQ ID NO:5 under corresponding reaction conditions.
2. A method of transferring a second sugar moiety to a C2′ of the 13-O-glucose, the 19-O-glucose, or both the 13-O-glucose and the 19-O-glucose of a precursor steviol glycoside having a 13-O-glucose, a 19-O-glucose, or both a 13-O-glucose and a 19-O-glucose, comprising contacting the precursor steviol glycoside with a first uridine 5′-diphospho (UDP) glycosyl transferase polypeptide capable of beta 1,2 glycosylation of a C2′ of the 13-O-glucose, the 19-O-glucose, or both the 13-O-glucose and the 19-O-glucose of the precursor steviol glycoside and one or more UDP-sugars, under suitable reaction conditions for the transfer of the second sugar moiety to the precursor steviol glycoside; wherein the first 5′-UDP glycosyl transferase polypeptide is capable of converting Rebaudioside A (RebA) to Rebaudioside D (RebD) at a rate that is at least 20 times faster than the rate at which a UDP glycosyl transferase polypeptide having the amino acid sequence set forth in SEQ ID NO:5 is capable of converting RebA to RebD under corresponding reaction conditions; and/or wherein the first 5′-UDP glycosyl transferase polypeptide is capable of converting higher amounts of RebA to RebD compared to the UDP glycosyl transferase polypeptide having the amino acid sequence set forth in SEQ ID NO:5 under corresponding reaction conditions.
3. The method of claim 1, wherein the starting composition is further contacted with: (a) a second 5′-UDP glycosyl transferase polypeptide capable of beta 1,2 glycosylation of the C2′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside; (b) a 5′-UDP glycosyl transferase polypeptide capable of glycosylating steviol or the precursor steviol glycoside at its C-19 carboxyl group; (c) a 5′-UDP glycosyl transferase polypeptide capable of glycosylating steviol or the precursor steviol glycoside at its C-13 hydroxyl group; and/or (d) a 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside.
4. The method of claim 1, wherein the first 5′-UDP glycosyl transferase polypeptide is expressed by a recombinant host comprising a recombinant gene encoding the first 5′-UDP glycosyl transferase polypeptide.
5. The method of claim 3, wherein at least one of the polypeptides is expressed in a recombinant host comprising one or more genes encoding the one or more polypeptides.
6. The method of claim 1, wherein the method is an in vitro method, further comprising supplying the one or more UDP-sugars and/or a cell-free system for regeneration of the one or more UDP-sugars.
7. The method of claim 6, wherein the target steviol glycoside is or the target steviol glycoside composition comprises RebD, the starting composition comprises RebA as the precursor steviol glycoside, wherein the starting composition is contacted with the first 5′-UDP glycosyl transferase polypeptide in stoichiometric or excess amount.
8. The method of claim 6, wherein the target steviol glycoside is or the target steviol glycoside composition comprises RebD, the starting composition comprises a stevia extract having at least one of RebA and stevioside as the precursor steviol glycoside, wherein the starting composition is contacted with the first 5′-UDP glycosyl transferase polypeptide, a 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside and UDP-glucose in stoichiometric or excess amount.
9. The method of claim 6, wherein the target steviol glycoside is RebA, RebD, rebaudioside B (RebB), steviol-1,2-bioside, stevioside, rebaudioside E (RebE), dulcoside A, rebaudioside C (RebC), rebaudioside F (RebF), or a mixture of two or more of these compounds.
10. The method of claim 6, wherein the polypeptides are provided in soluble form or in immobilized form.
11. The method of claim 6, wherein the in vitro method is a whole cell in vitro method, wherein the whole cells are fed raw materials comprising the steviol and/or the precursor steviol glycosides.
12. The method of claim 11, wherein the whole cells are fed raw materials comprising the steviol and/or the precursor steviol glycosides derived from plant extracts.
13. The method of claim 11, wherein the whole cell used in the whole cell in vitro method is: (a) in suspension or immobilized; (b) entrapped in a calcium or sodium alginate bead; (c) linked to a hollow fiber tube reactor system; (d) concentrated and entrapped within a membrane reactor system; or (e) in fermentation broth or in a reaction buffer.
14. The method of claim 11, further comprising permeabilizing the whole cell by using a permeabilizing agent, wherein the permeabilizing agent is a solvent, a detergent, or a surfactant, by a mechanical shock, an electroporation, or an osmotic shock.
15. The method of claim 11, wherein the whole cell is a microorganism being a prokaryote or a eukaryote.
16. The method of claim 11, wherein the whole cell is an Escherichia coli cell, a Saccharomyces cerevisiae cell, or a Yarrowia lipolytica cell.
17. The method of claim 11, wherein the steviol is converted to RebA, RebD and/or RebE and the whole cell is a recombinant cell expressing: (a) the first 5′-UDP glycosyl transferase polypeptide; (b) the 5′-UDP glycosyl transferase polypeptide capable of glycosylating steviol or the precursor steviol glycoside at its C-19 carboxyl group; (c) the 5′-UDP glycosyl transferase polypeptide capable of glycosylating steviol or the precursor steviol glycoside at its C-13 hydroxyl group; and (d) the 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside.
18. The method of claim 17, wherein the recombinant cell further expresses the second 5′-UDP glycosyl transferase polypeptide.
19. The method of claim 11, wherein RebA is converted to RebD and the whole cell is the recombinant cell expressing the first 5′-UDP glycosyl transferase polypeptide.
20. The method of claim 11, wherein rubusoside or stevioside is converted to RebD and the whole cell is the recombinant cell expressing: (a) the first 5′-UDP glycosyl transferase polypeptide; and (b) the 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside.
21. The method of claim 20, wherein the recombinant cell further expresses the second 5′-UDP glycosyl transferase polypeptide.
22. The method of claim 11, wherein steviol-13-O-glucoside (13-SMG) is converted to RebD and the whole cell is the recombinant cell expressing: (a) the first 5′-UDP glycosyl transferase polypeptide; (b) the 5′-UDP glycosyl transferase polypeptide capable of glycosylating steviol or the precursor steviol glycoside at its C-19 carboxyl group; and (c) the 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside.
23. The method of claim 22, wherein the recombinant cell further expresses the second 5′-UDP glycosyl transferase polypeptide.
24. The method of claim 11, wherein steviol-19-O-glucoside (19-SMG) is converted to RebD and the whole cell is a recombinant cell expressing: (a) the first 5′-UDP glycosyl transferase polypeptide; (b) the 5′-UDP glycosyl transferase polypeptide capable of glycosylating steviol or the precursor steviol glycoside at its C-13 hydroxyl group; and (c) the 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside.
25. The method of claim 24, wherein the recombinant cell further expresses the second 5′-UDP glycosyl transferase polypeptide.
26. The method of claim 1, wherein the first 5′-UDP glycosyl transferase polypeptide has the amino acid sequence with at least 70% sequence identity to the amino acid sequence set forth in SEQ ID NO:152.
27. The method of claim 1, wherein the starting composition is further contacted with: (a) the second 5′-UDP glycosyl transferase polypeptide having at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO:5 or the amino acid sequence set forth in SEQ ID NO:76 or 78; (b) the 5′-UDP glycosyl transferase polypeptide capable of glycosylating the steviol or the precursor steviol glycoside at its C-19 carboxyl group has the amino acid sequence having at least 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:1; (c) the 5′-UDP glycosyl transferase polypeptide capable of glycosylating the steviol or the precursor steviol glycoside at its C-13 hydroxyl group has the amino acid sequence having at least 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:3; and/or (d) the 5′-UDP glycosyl transferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside has the amino acid sequence having at least 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:7.
28. The method of claim 2, wherein: (a) the precursor steviol glycoside is rubusoside, the second sugar moiety is glucose, and stevioside is produced upon transfer of the second glucose moiety; (b) the precursor steviol glycoside is stevioside, the second sugar moiety is glucose, and RebE is produced upon transfer of the second glucose moiety; and/or (c) the steviol glycoside is RebA, the second sugar moiety is glucose and RebD is produced upon transfer of the second glucose moiety.
29. The method of claim 2, wherein the first 5′-UDP glycosyltransferase polypeptide has an amino acid sequence with at least 70% sequence identity to the amino acid sequence set forth in SEQ ID NO:152.
30. The method of claim 2, wherein the precursor steviol glycoside is further contacted with: (a) the second 5′-UDP glycosyltransferase polypeptide having at least 80% sequence identity to the amino acid sequence set forth in SEQ ID NO:5 or the amino acid sequence set forth in SEQ ID NO:76 or 78; (b) the 5′-UDP glycosyltransferase polypeptide capable of glycosylating the steviol or the precursor steviol glycoside at its C-19 carboxyl group has the amino acid sequence having at least 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:1; (c) the 5′-UDP glycosyltransferase polypeptide capable of glycosylating the steviol or the precursor steviol glycoside at its C-13 hydroxyl group has the amino acid sequence having at least 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:3; and/or (d) the 5′-UDP glycosyltransferase polypeptide capable of beta 1,3 glycosylation of the C3′ of the 13-O-glucose, 19-O-glucose, or both 13-O-glucose and 19-O-glucose of the precursor steviol glycoside has the amino acid sequence having at least 90% sequence identity to the amino acid sequence set forth in SEQ ID NO:7.